douya6606 2015-04-05 22:15
Views: 63
Accepted

PHP: extract an HTML tag and its contents [duplicate]

I have:

<html>
<head>
    <title>My Page</title>
</head>
<body>
    <p>paragraph 1</p>
    <p>paragraph 2</p>
    <p>paragraph 3</p>
    <p>paragraph 4</p>
    <ul>
        <li>item # 1</li>
        <li>item # 2</li>
        <li>item # 3</li>
        <li>item # 4</li>
    </ul>
    <a href="#">anchor 1</a>
    <a href="#">anchor 2</a>
    <a href="#">anchor 3</a>
    <a href="#">anchor 4</a>
    <div>div # 1</div>
    <div>div # 2</div>
    <div>div # 3</div>
    <div>div # 4</div>
</body>
</html>

I want to be able to extract a specified tag, let's say a div tag, together with its contents.

So far I have

$file = file_get_contents('file.html');
$dom = new DOMDocument();
$dom->loadHTML( $file );
$xpath = new DOMXPath( $dom );
$paragraphs = $xpath->query("/html/body//p");

for( $i = 0; $i < $paragraphs->length; $i++ )
{
     # echo the tag and its contents
}

I tried using nodeValue and textContent, but they only print the text inside the tag, not the tag itself together with its contents.

This is my first time using the DOM parser in PHP. I know that parsing HTML/XML with regexes is discouraged, so I am using the DOM parser instead. Any suggestions would help.


1 answer

  • doz15449 2015-04-05 22:28

    This should work on PHP 5.3.6 and later, where DOMDocument::saveHTML accepts an optional node argument. Just pass the node to it.

    for( $i = 0; $i < $paragraphs->length; $i++ )
    {
         // $paragraphs is the DOMNodeList returned by the XPath query in the question
         echo $dom->saveHTML($paragraphs->item($i));
    }
    

    I hope this helps!
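
For reference, here is a minimal self-contained sketch of the same idea, assuming PHP 7 or later for the type hints. The helper name `extractTags` and the inline sample markup are hypothetical and stand in for `file.html` from the question. It uses `getElementsByTagName` rather than the XPath query, which is just one possible way to select the nodes.

```php
<?php
// Hypothetical sample markup standing in for file.html from the question.
$html = <<<HTML
<html>
<head><title>My Page</title></head>
<body>
    <p>paragraph 1</p>
    <div>div # 1</div>
    <div>div # 2</div>
</body>
</html>
HTML;

/**
 * Hypothetical helper: return the outer HTML (tag plus contents)
 * of every element with the given tag name.
 */
function extractTags(string $html, string $tagName): array
{
    $dom = new DOMDocument();

    // Collect libxml parse warnings internally instead of printing them,
    // then restore the previous setting.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $result = [];
    foreach ($dom->getElementsByTagName($tagName) as $node) {
        // Passing a node to saveHTML() serializes that node only,
        // including its own opening and closing tags.
        $result[] = $dom->saveHTML($node);
    }
    return $result;
}

$divs = extractTags($html, 'div');
foreach ($divs as $div) {
    echo $div, PHP_EOL;
}
```

Calling `extractTags($html, 'p')` would return the paragraph elements in the same way, so the tag name can be chosen at run time.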

