duanchi0897 2013-04-30 19:01

Use DomDocument to replace all header tags with the h4 tags

I've used DomDocument's getElementById to select a div. I need to replace all the header tags within that div with h4 tags.


  • dongli1920 2013-05-01 08:15

    Your question does not make clear which concrete problem you ran into. I assume there are two parts that could raise questions.

    The first is how to get hold of all the elements you want to rename; the second is how to actually rename an element.

    Get Heading Elements of a DOMDocument

    So first things first: to select all the heading elements, you need to select every tag that is a heading (h1 to h6). Combined with the condition that they must also be descendants of the div with a specific id attribute, this sounds rather complicated. With an XPath query, however, it is fairly straightforward.

    For my code examples I have chosen the id "content". The following XPath expression queries all heading elements inside that div:

    (
        //div[@id="content"]//h1
        |//div[@id="content"]//h2
        |//div[@id="content"]//h3
        |//div[@id="content"]//h4
        |//div[@id="content"]//h5
        |//div[@id="content"]//h6
    )
    

    Running this against the page of this very question (before I answered it) produces the following list of tags:

    Found 8 elements:
     #00: <h1>
     #01: <h2>
     #02: <h2>
     #03: <h3>
     #04: <h3>
     #05: <h3>
     #06: <h2>
     #07: <h4>
    

    This shows that a single XPath query can build a list of different elements under a specific condition, here being a descendant of the div with the given id. The code at a glance:

    $url = 'http://stackoverflow.com/questions/16307103/use-domdocument-to-replace-all-header-tags-with-the-h4-tags';
    
    // Load the page; collect libxml parse warnings internally instead of printing them.
    $dom = new DOMDocument();
    $internalErrorsState = libxml_use_internal_errors(true);
    $dom->loadHTMLFile($url);
    libxml_use_internal_errors($internalErrorsState);
    $xpath = new DOMXPath($dom);
    
    // Union of all heading levels (h1 to h6) below the div with id "content".
    $expression = '
    (
        //div[@id="content"]//h1
        |//div[@id="content"]//h2
        |//div[@id="content"]//h3
        |//div[@id="content"]//h4
        |//div[@id="content"]//h5
        |//div[@id="content"]//h6
    )';
    
    // List the tag name of every match.
    $elements = $xpath->query($expression);
    echo "Found ", $elements->length, " elements:\n";
    foreach ($elements as $index => $element) {
        printf(" #%02d: <%s>\n", $index, $element->tagName);
    }
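
    The six-way union above can also be written as a single predicate using the self axis. The following is a minimal sketch, not part of the original code: it assumes a small inline HTML string (hypothetical sample markup) in place of the live page so that it runs offline, and it collects the matched tag names into an array.

    ```php
    <?php
    // Hypothetical sample markup standing in for the fetched page.
    $html = '<html><body>'
          . '<h1>Outside</h1>'
          . '<div id="content"><h1>A</h1><p>text</p><h2>B</h2><div><h3>C</h3></div></div>'
          . '</body></html>';

    $dom = new DOMDocument();
    $internalErrorsState = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_use_internal_errors($internalErrorsState);
    $xpath = new DOMXPath($dom);

    // One predicate instead of six unioned paths: any descendant element
    // of the div whose own name is h1 ... h6.
    $compact = '//div[@id="content"]//*['
             . 'self::h1 or self::h2 or self::h3 or self::h4 or self::h5 or self::h6]';

    $names = array();
    foreach ($xpath->query($compact) as $element) {
        $names[] = $element->tagName;
    }
    echo implode(',', $names), "\n";
    ```

    The heading outside the div is not matched, while the one nested in an inner div is, because `//` selects descendants at any depth.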
    

    Renaming a DOMElement

    So what about the second problem, about renaming elements?

    DOMDocument does not support this out of the box. There is a method stub (DOMDocument::renameNode(); undocumented in the current PHP manual), but if you call it you get a warning that it is not implemented:

    Warning: DOMDocument::renameNode(): Not yet implemented

    Instead, you need to roll your own version. It works like this: since you cannot rename an element with DOMDocument, you create a new element with the new name, copy all attributes of the original node onto it, move all of its children into it, and then replace the original node with the new element. The following function does this:

    /**
     * Renames a node in a DOM Document.
     *
     * @param DOMElement $node
     * @param string     $name
     *
     * @return DOMNode the original node that was replaced (now detached and empty)
     */
    function dom_rename_element(DOMElement $node, $name) {
        $renamed = $node->ownerDocument->createElement($name);
    
        foreach ($node->attributes as $attribute) {
            $renamed->setAttribute($attribute->nodeName, $attribute->nodeValue);
        }
    
        while ($node->firstChild) {
            $renamed->appendChild($node->firstChild);
        }
    
        return $node->parentNode->replaceChild($renamed, $node);
    }
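
    To see what the function does to attributes and children, here is a small usage sketch. It repeats the function from above so the snippet runs on its own; the markup, the class name "title" and the variable names are made up for illustration.

    ```php
    <?php
    function dom_rename_element(DOMElement $node, $name) {
        $renamed = $node->ownerDocument->createElement($name);

        // Copy every attribute onto the new element.
        foreach ($node->attributes as $attribute) {
            $renamed->setAttribute($attribute->nodeName, $attribute->nodeValue);
        }

        // Move (not copy) the children; the loop ends once the old node is empty.
        while ($node->firstChild) {
            $renamed->appendChild($node->firstChild);
        }

        return $node->parentNode->replaceChild($renamed, $node);
    }

    // Hypothetical sample markup.
    $dom = new DOMDocument();
    $dom->loadHTML('<div id="content"><h2 class="title">Hello <em>world</em></h2></div>');

    $heading = $dom->getElementsByTagName('h2')->item(0);
    dom_rename_element($heading, 'h4');

    // Serialize only the div to inspect the result.
    $div = $dom->getElementsByTagName('div')->item(0);
    echo $dom->saveHTML($div), "\n";
    ```

    The printed div should now contain an h4 that still carries the class attribute and the nested em element. Note that `$heading` keeps pointing at the old, detached h2, so any further work has to use the new element.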
    

    Bringing this together with the foreach loop from above, the elements can now be renamed in addition to having their tag names printed:

    $elements = $xpath->query($expression);
    echo "Found ", $elements->length, " elements:\n";
    foreach ($elements as $index => $element) {
        printf(" #%02d: <%s>\n", $index, $element->tagName);
        dom_rename_element($element, 'h4'); // new: rename each heading to h4
    }
    

    Querying the XPath expression again afterwards returns h4 tags only:

    $elements = $xpath->query($expression);
    echo "Found ", $elements->length, " elements:\n";
    foreach ($elements as $index => $element) {
        printf(" #%02d: <%s>\n", $index, $element->tagName);
    }
    

    Output:

    Found 8 elements:
     #00: <h4>
     #01: <h4>
     #02: <h4>
     #03: <h4>
     #04: <h4>
     #05: <h4>
     #06: <h4>
     #07: <h4>
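
    If you already hold the div as a DOMElement (for example from getElementById, as in the question), the same selection can be made relative to that node by passing it as the context node to DOMXPath::query(). The following is a sketch under that assumption; the function name `rename_headings_in` and the sample markup are hypothetical, and the renaming steps are inlined so the snippet runs on its own.

    ```php
    <?php
    /**
     * Renames all h1..h6 descendants of $container to $name.
     * Hypothetical helper, not part of the original answer.
     *
     * @return int number of renamed elements
     */
    function rename_headings_in(DOMElement $container, $name) {
        $document = $container->ownerDocument;
        $xpath    = new DOMXPath($document);

        // The leading "." anchors each path at $container instead of the document root.
        $headings = $xpath->query(
            './/h1 | .//h2 | .//h3 | .//h4 | .//h5 | .//h6',
            $container
        );
        $count = $headings->length;

        foreach ($headings as $heading) {
            $renamed = $document->createElement($name);
            foreach ($heading->attributes as $attribute) {
                $renamed->setAttribute($attribute->nodeName, $attribute->nodeValue);
            }
            while ($heading->firstChild) {
                $renamed->appendChild($heading->firstChild);
            }
            $heading->parentNode->replaceChild($renamed, $heading);
        }

        return $count;
    }

    // Hypothetical sample markup: one heading outside the div, two inside.
    $dom = new DOMDocument();
    $dom->loadHTML('<h1>Keep</h1><div id="content"><h1>A</h1><h3>B</h3></div>');

    // The div is looked up via XPath here; any other way of obtaining it works too.
    $lookup = new DOMXPath($dom);
    $div    = $lookup->query('//div[@id="content"]')->item(0);

    $count = rename_headings_in($div, 'h4');
    echo "Renamed ", $count, " elements\n";
    ```

    Headings outside the container are left untouched, which keeps the id lookup and the heading selection as two separate steps.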
    

    Full Code-Example

    Here is the full code example at a glance:

    <?php
    /**
     * Use DomDocument to replace all header tags with the h4 tags
     * @link http://stackoverflow.com/q/16307103/367456
     */
    $url = 'http://stackoverflow.com/questions/16307103/use-domdocument-to-replace-all-header-tags-with-the-h4-tags';
    
    $dom = new DOMDocument();
    $internalErrorsState = libxml_use_internal_errors(true);
    $dom->loadHTMLFile($url);
    libxml_use_internal_errors($internalErrorsState);
    $xpath = new DOMXPath($dom);
    
    $expression = '
    (
        //div[@id="content"]//h1
        |//div[@id="content"]//h2
        |//div[@id="content"]//h3
        |//div[@id="content"]//h4
        |//div[@id="content"]//h5
        |//div[@id="content"]//h6
    )';
    
    $elements = $xpath->query($expression);
    echo "Found ", $elements->length, " elements:\n";
    foreach ($elements as $index => $element) {
        printf(" #%02d: <%s>\n", $index, $element->tagName);
        dom_rename_element($element, 'h4');
    }
    
    $elements = $xpath->query($expression);
    echo "Found ", $elements->length, " elements:\n";
    foreach ($elements as $index => $element) {
        printf(" #%02d: <%s>\n", $index, $element->tagName);
    }
    
    /**
     * Renames a node in a DOM Document.
     *
     * @param DOMElement $node
     * @param string     $name
     *
     * @return DOMNode the original node that was replaced (now detached and empty)
     */
    function dom_rename_element(DOMElement $node, $name) {
        $renamed = $node->ownerDocument->createElement($name);
    
        foreach ($node->attributes as $attribute) {
            $renamed->setAttribute($attribute->nodeName, $attribute->nodeValue);
        }
    
        while ($node->firstChild) {
            $renamed->appendChild($node->firstChild);
        }
    
        return $node->parentNode->replaceChild($renamed, $node);
    }
    

    If you try it out, you might notice that the number of heading elements has changed now that my answer is on the page. I hope this is helpful!
