PHP xml to array - how to get rid of empty tags?

The same problem is described in "xml to array - remove empty array php", but that question is not mine and was asked more than two years ago, so I don't see how I could get an answer there. I'm therefore asking my own question here:

Simple script:

$xml
    = '<?xml version="1.0"?>
       <Envelope>
           <foo>
               <bar>
                   <baz>Hello</baz>
                   <bat/>
               </bar> 
           </foo>
           <foo>
               <bar>
                   <baz>Hello Again</baz>
                   <bat></bat>
               </bar>
           </foo>
           <foo>
               <bar>
                   <baz>Hello Again</baz>
                   <bat> </bat>
               </bar>
           </foo>
       </Envelope>';

$xml = new \SimpleXMLElement(
    $xml,
    LIBXML_NOBLANKS | LIBXML_NOEMPTYTAG | LIBXML_NOCDATA
);
$array = json_decode(json_encode((array)$xml), true);
// [
//     'foo' => [
//         0 => [
//             'bar' => [
//                 'baz' => 'Hello',
//                 'bat' => [], <<-- how to get this to NULL
//             ],
//         ],
//         1 => [
//             'bar' => [
//                 'baz' => 'Hello Again',
//                 'bat' => [], <<-- how to get this to NULL
//             ],
//         ],
//         2 => [
//             'bar' => [
//                 'baz' => 'Hello Again',
//                 'bat' => [   <<-- how to get this to NULL
//                     0 => ' ',     or at least to value of " " without array
//                 ],
//             ],
//         ],
//     ],
// ];

As you can see, there is an empty <bat/> tag, and the last <bat> </bat> tag contains only whitespace.

I would like both to become null in the array.

I tried the following, but of course it only works for the first level:

$data = (array)$xml;
foreach ($data as &$item) {
    if (
        $item instanceof \SimpleXMLElement
        and $item->count() === 0
    ) {
        // is a object(SimpleXMLElement)#1 (0) {}
        $item = null; 
    }
}

I tried to do this recursively and failed.

I also tried RecursiveIteratorIterator, without success.

But there must be a way to set those offsets to null.

Has anybody done this before?

EDIT

Solved. See https://stackoverflow.com/a/55733384/3411766

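dp7311: Whenever I see someone trying to write a generic XML-to-array function, especially one that starts with something like $array = json_decode(json_encode((array)$xml), true), I despair. In every case where the generic algorithm gives the "wrong" answer, use an XML parser (such as SimpleXML) to access the data you actually need and build the array (or predefined objects) that makes sense for your actual application, rather than working around special cases.
(more than a year ago)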
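
To make the comment above concrete, here is a minimal sketch of reading only the fields you need from the question's XML with SimpleXML and building the array by hand. The helper name textOrNull and the baz/bat row layout are assumptions made for illustration; they do not come from the original posts.

```php
<?php

/**
 * Hypothetical helper (not part of SimpleXML): returns the trimmed text of
 * an element, or null when the element is missing, empty (<bat/>), or
 * contains only whitespace (<bat> </bat>).
 */
function textOrNull(?SimpleXMLElement $node): ?string
{
    if ($node === null) {
        return null;
    }
    $text = trim((string) $node);

    return $text === '' ? null : $text;
}

$xml = new SimpleXMLElement(
    '<Envelope>
        <foo><bar><baz>Hello</baz><bat/></bar></foo>
        <foo><bar><baz>Hello Again</baz><bat></bat></bar></foo>
        <foo><bar><baz>Hello Again</baz><bat> </bat></bar></foo>
    </Envelope>'
);

// Build exactly the structure the application needs, one row per <foo>.
$rows = [];
foreach ($xml->foo as $foo) {
    $rows[] = [
        'baz' => textOrNull($foo->bar->baz),
        'bat' => textOrNull($foo->bar->bat),
    ];
}

var_dump($rows);
```

Note that this trims the text values, which the generic conversion in the question does not do. Whether that is wanted depends on the data.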

2 answers

I figured it out myself. It took a while, but it works perfectly.

/**
 * Recursively converts a SimpleXMLElement (or an array of them) to an array,
 * mapping empty and whitespace-only elements to null.
 *
 * @param array|\SimpleXMLElement[]|\SimpleXMLElement $data
 *
 * @return array|null
 */
public static function emptyNodesToNull($data)
{
    if ($data instanceof \SimpleXMLElement && $data->count() === 0) {
        // An object without child elements, e.g.
        //  SimpleXMLElement::__set_state(array())
        //  which was e.g. a <foo/> tag
        // or
        //  SimpleXMLElement::__set_state(array(0 => ' ',))
        //  which was e.g. a <foo> </foo> (whitespace only)
        return null;
    }
    $data = (array)$data;
    foreach ($data as &$value) {
        if (is_array($value) || $value instanceof \SimpleXMLElement) {
            $value = self::emptyNodesToNull($value);
        }
        // Otherwise $value is the actual value of a node;
        // further checks could be done here.
    }
    unset($value); // break the reference left over from the loop
    return $data;
}

My tests did exactly what I expected, and the function returns, in my opinion, exactly what you would expect from an xmlToArray method.

It does not handle attributes, but that was not a requirement.

Test:

    $xml
        = '<?xml version="1.0"?>
   <Envelope>
       <a/><!-- expecting null -->
       <foo>
           <b/><!-- expecting null -->
           <bar>
               <baz>Hello</baz>

               <!-- expecting here an array of 2 x null -->
               <c/>
               <c/>

           </bar> 
       </foo>
       <foo>
           <bar>
               <baz>Hello Again</baz>
               <d>    </d><!-- expecting null -->
               <item>
                   <firstname>Foo</firstname>
                   <email></email><!-- expecting null -->
                   <telephone/><!-- expecting null -->
                   <lastname>Bar</lastname>
               </item>
               <item>
                   <firstname>Bar</firstname>
                   <email>0</email><!-- expecting value 0 (zero) -->
                   <telephone/><!-- expecting null -->
                   <lastname>Baz</lastname>
               </item>

               <!-- expecting array of values 1, 2 null, 4 -->
               <number>1</number>
               <number>2</number>
               <number></number>
               <number>4</number>
           </bar>
       </foo>
   </Envelope>';

$xml = new \SimpleXMLElement($xml);
$array = $class::emptyNodesToNull($xml);

Returns:

[
    'Envelope' => [
        'a'   => null,
        'foo' => [
            0 => [
                'b'   => null,
                'bar' => [
                    'baz' => 'Hello',
                    'c'   => [
                        0 => null,
                        1 => null,
                    ],
                ],
            ],
            1 => [
                'bar' => [
                    'baz'    => 'Hello Again',
                    'd'      => null,
                    'item'   => [
                        0 => [
                            'firstname' => 'Foo',
                            'email'     => null,
                            'telephone' => null,
                            'lastname'  => 'Bar',
                        ],
                        1 => [
                            'firstname' => 'Bar',
                            'email'     => '0',
                            'telephone' => null,
                            'lastname'  => 'Baz',
                        ],
                    ],
                    'number' => [
                        0 => '1',
                        1 => '2',
                        2 => null,
                        3 => '4',
                    ],
                ],
            ],
        ],
    ],
];
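
For comparison, here is a sketch of the same idea that walks children() instead of casting to an array, so it only ever sees element nodes. The function name nodeToValue is hypothetical. Unlike the method above it trims text values, and like the method above it ignores attributes; text that is mixed with child elements is dropped.

```php
<?php

/**
 * Hypothetical alternative: converts an element to a nested array,
 * a trimmed string (leaf with text), or null (empty/whitespace-only leaf).
 */
function nodeToValue(SimpleXMLElement $node)
{
    if ($node->count() === 0) {
        // Leaf element: empty or whitespace-only text becomes null.
        $text = trim((string) $node);

        return $text === '' ? null : $text;
    }

    // Group child values by element name, in document order.
    $grouped = [];
    foreach ($node->children() as $name => $child) {
        $grouped[$name][] = nodeToValue($child);
    }

    // A name that occurs once maps to its value, repeats stay a list.
    foreach ($grouped as $name => $values) {
        if (count($values) === 1) {
            $grouped[$name] = $values[0];
        }
    }

    return $grouped;
}

$xml = new SimpleXMLElement(
    '<Envelope>
        <a/>
        <foo><c/><c/><baz>Hello</baz></foo>
        <foo>
            <d>   </d>
            <number>1</number><number></number><number>0</number>
        </foo>
    </Envelope>'
);

$array = nodeToValue($xml);
var_dump($array);
```

One consequence of collapsing single occurrences is that an element that appears once and one that appears several times get different shapes, which is the same ambiguity the json_encode approach has.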

You can use XPath with the predicate not(node()) to select all elements that do not have child nodes.

<?php

$doc = new DOMDocument;
$doc->preserveWhiteSpace = false;
$doc->loadXML('<?xml version="1.0"?>
       <Envelope>
           <foo>
               <bar>
                   <baz>Hello</baz>
                   <bat/>
               </bar>
           </foo>
           <foo>
               <bar>
                   <baz>Hello Again</baz>
                   <bat></bat>
               </bar>
           </foo>
           <foo>
               <bar>
                   <baz>Hello Again</baz>
                   <bat></bat>
               </bar>
           </foo>
       </Envelope>');

$xpath = new DOMXPath($doc);

foreach( $xpath->query('//*[not(node())]') as $node ) {
    $node->parentNode->removeChild($node);
}

$doc->formatOutput = true;
echo $doc->saveXML();

Print:

<?xml version="1.0"?>
<Envelope>
  <foo>
    <bar>
      <baz>Hello</baz>
    </bar>
  </foo>
  <foo>
    <bar>
      <baz>Hello Again</baz>
    </bar>
  </foo>
  <foo>
    <bar>
      <baz>Hello Again</baz>
    </bar>
  </foo>
</Envelope>

Regards!

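douyan6742: Same reply as to @NigelRen: this removes the offsets. As posted, I want them to be null, so the offset has no value but still exists, just as in the XML (present but empty).
(more than a year ago)
duanmoen784988: I'm not sure whether the last element, <bat> </bat>, contained whitespace in the OP's version.
(more than a year ago)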
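
Following up on both comments, here is a hedged sketch of a DOM/XPath variant that keeps the elements and also catches the whitespace-only case. It assumes that "empty" means "no child elements and no text other than whitespace", and it uses the XPath 1.0 function normalize-space() for that test. The shortened XML is for illustration only.

```php
<?php

$doc = new DOMDocument();
$doc->loadXML(
    '<Envelope>'
    . '<foo><bar><baz>Hello</baz><bat/></bar></foo>'
    . '<foo><bar><baz>Hello Again</baz><bat> </bat></bar></foo>'
    . '</Envelope>'
);

$xpath = new DOMXPath($doc);

// Elements with no child elements and no text other than whitespace.
$emptyNodes = $xpath->query('//*[not(*) and not(normalize-space())]');
$matched = $emptyNodes->length;

foreach ($emptyNodes as $node) {
    // Strip the leftover children (whitespace text, and also any comments)
    // but keep the element itself, so the key still exists afterwards.
    while ($node->firstChild !== null) {
        $node->removeChild($node->firstChild);
    }
}

echo $doc->saveXML();
```

After this pass both <bat> elements are still present and truly empty, so a later conversion step can treat them uniformly, for example by mapping them to null.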