dongrongdao8902 2014-10-22 22:18

PHP: HTML string loaded into DOMDocument does not return an array with every element

I'm trying to write a function that converts an HTML string into a multidimensional array in which each key of the parent array is a tag name and the child array holds that element's attributes. When I print_r() the result, however, it does not contain every element.

The string is originally part of a larger object, which looks like this:

Array
(
  [0] => stdClass Object
   (          
    [html] => 
      <input type="radio" name="radio1" value="18" checked="checked" id="80">
      <label class="other" for="80">Label for radio 1</label>
      <input type="radio" name="radio2" value="20" id="81">
      <label class="other" for="81">Label for radio 2</label>
   )

  [1] => stdClass Object
   (
    [html] => 
      <input type="radio" name="radio3" value="19" checked="checked" id="91">
      <label class="other" for="91">Label for radio 3</label>
      <input type="radio" name="radio4" value="21" id="92">
      <label class="other" for="92">Label for radio 4</label>
   )

)

and this is my function:

<?php
function htmltoarray($param){
    $doc = new DOMDocument();
    $doc->loadHTML($param);
    $doc->preserveWhiteSpace = false;
    $html = $doc->getElementsByTagName('*');
    $form = array();
    foreach($html as $v){
        $tag = $v->nodeName;
        $val = $v->nodeValue;
        foreach($v->attributes as $k => $a){
            $form[$tag]['txt'] = utf8_decode($val);
            $form[$tag][$k] = $a->nodeValue;
        }
    }
    return $form;
}

// AND I CALL THE FUNCTION HERE:
foreach($myobject as $formelement){
  $convertthis = $formelement->html;
  echo '<pre>'; print_r(htmltoarray($convertthis)); echo '</pre>';
}
?>

and this returns this:

<pre>Array
(
 [input] => Array
   (
     [txt] => 
     [id] => 80
     [checked] => checked
     [type] => radio
     [value] => 20
     [name] => radio1
   )

 [label] => Array
   (
     [txt] => Label for radio 1
     [for] => 80
     [class] => other
  )

)
</pre>

<pre>Array
(
 [input] => Array
   (
     [txt] => 
     [id] => 92
     [checked] => checked
     [type] => radio
     [value] => 21
     [name] => radio4
   )

 [label] => Array
   (
     [txt] => Label for radio 4
     [for] => 92
     [class] => other
   )

)
</pre>

As you can see, it returns only two elements per string: the first input and label from the first string, and the last input and label from the second.

What am I missing? Why do I get this strange result, and how can I fix the function so that it returns every element?


1 answer

  • dousong1926 2014-10-23 00:28

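    The values inside your array are being overwritten. $form[$tag] is keyed by the tag name alone, so every further element with the same tag writes its attributes over the previous one, and only the last value of each key survives. Collect each element's attributes in a temporary array first, merge that array with the text value, and then append the result under the tag.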
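
    The overwrite can be reproduced without DOMDocument at all. The following is a minimal sketch using made-up data (the $elements array and both result variables are hypothetical, not part of the question's code); it only contrasts assigning to a fixed key with appending via []:

```php
<?php
// Hypothetical stand-in for two parsed elements that share the tag name "input".
$elements = array(
    array('tag' => 'input', 'id' => '80'),
    array('tag' => 'input', 'id' => '81'),
);

$overwritten = array();
$appended = array();
foreach ($elements as $el) {
    // Same key on every pass: the second element replaces the first.
    $overwritten[$el['tag']]['id'] = $el['id'];
    // [] appends a new entry, so both elements are kept.
    $appended[$el['tag']][] = array('id' => $el['id']);
}

echo $overwritten['input']['id'], "\n"; // prints 81, the id 80 is gone
echo count($appended['input']), "\n";   // prints 2
```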

    I improved it a little bit:

    function htmltoarray($param){
        $doc = new DOMDocument();
        // parser options only take effect when set before loading
        $doc->preserveWhiteSpace = false;
        $doc->loadHTML($param);
        $form = array();
        // get body children
        $body = $doc->getElementsByTagName('body')->item(0);
        if($body === null) { // nothing was parsed
            return $form;
        }
        foreach($body->childNodes as $v){
            if($v instanceof DOMElement) { // disregard text and comment nodes
                $tag = $v->nodeName;
                $val = $v->nodeValue;
                $attrs = array();
                foreach($v->attributes as $k => $a){ // gather all attributes
                    $attrs[$k] = $a->nodeValue;
                }
                // merge dom text value with attributes
                // (note: utf8_decode() is deprecated as of PHP 8.2)
                $element = array_merge(array('txt' => utf8_decode($val)), $attrs);
                $form[$tag][] = $element; // push them inside with another dimension
                        // ^ this one
            }
        }

        return $form;
    }
    
    // AND I CALL THE FUNCTION HERE:
    foreach($myobject as $formelement){
      $convertthis = $formelement->html;
      echo '<pre>'; print_r(htmltoarray($convertthis)); echo '</pre>';
    }
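
    To check the grouping on the first HTML fragment from the question, here is a self-contained sketch. The name htmlToGroupedArray is hypothetical; it follows the same idea as htmltoarray() above, but assumes ASCII input (so it skips utf8_decode) and silences libxml warnings with libxml_use_internal_errors():

```php
<?php
// Hypothetical helper: groups the direct children of <body> by tag name.
function htmlToGroupedArray($html)
{
    $doc = new DOMDocument();
    // Collect parser warnings internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $grouped = array();
    $body = $doc->getElementsByTagName('body')->item(0);
    if ($body === null) {
        return $grouped;
    }
    foreach ($body->childNodes as $node) {
        if (!($node instanceof DOMElement)) {
            continue; // skip text and comment nodes
        }
        $element = array('txt' => trim($node->nodeValue));
        foreach ($node->attributes as $name => $attr) {
            $element[$name] = $attr->nodeValue;
        }
        // Append, so elements with the same tag do not overwrite each other.
        $grouped[$node->nodeName][] = $element;
    }
    return $grouped;
}

// First HTML fragment from the question.
$html = '<input type="radio" name="radio1" value="18" checked="checked" id="80">'
      . '<label class="other" for="80">Label for radio 1</label>'
      . '<input type="radio" name="radio2" value="20" id="81">'
      . '<label class="other" for="81">Label for radio 2</label>';

$result = htmlToGroupedArray($html);
echo count($result['input']), ' inputs, ', count($result['label']), " labels\n"; // 2 inputs, 2 labels
echo $result['input'][1]['name'], "\n"; // radio2
echo $result['label'][0]['txt'], "\n";  // Label for radio 1
```

    Grouping by tag keeps every element but loses the document order between inputs and labels; the id and for attributes can still be used to pair them up again.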
    


    The asker accepted this as the best answer.
