douzhuo6270 2009-01-01 00:50

Scraping with PHP + SimpleXML: I can grab images but not raw text?

I'm trying to grab a specific bit of raw text from a web site. Using this site and other sources, I learned how to grab specific images using SimpleXML and XPath.

However, the same approach doesn't appear to work for grabbing raw text. Here's what's NOT working right now.

// first I set the xpath of the div that contains the text I want
$xpath = '//*[@id="storyCommentCountNumber"]';

// then I create a new DOM Document
$html = new DOMDocument();

// then I fetch the file and parse it (@ suppresses warnings).
@$html->loadHTMLFile($url);

// then convert DOM to SimpleXML
$xml = simplexml_import_dom($html);   

// run an XPath query on the div I want using the previously set xpath
$commcount = $xml->xpath($xpath);
print_r($commcount);

Now, when I'm grabbing an image, that $commcount object returns an array that contains the image's source in it somewhere.

In this case, I want that object to return the raw text contained in the "storyCommentCountNumber" div. But that text doesn't appear to be contained in the object, just the name of the div.

What am I doing wrong? I can kind of see that this approach is only for grabbing HTML elements and the bits inside of them, not raw text. How do I get the text inside that div?

Thanks!

5 answers

  • dongwen7813 2009-01-01 01:12

    Try checking this page out.

    :)
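
For reference, here is a minimal sketch of one way to read the text, not necessarily what the linked page describes. The output of print_r() is not a reliable view of what a SimpleXMLElement holds; casting the element to a string returns its text content. The sample markup and the variable names $sampleHtml, $matches and $commentCount are made up for illustration, and the inline string stands in for the page that loadHTMLFile($url) would fetch.

```php
<?php
// Hypothetical sample markup standing in for the fetched page (assumption).
$sampleHtml = '<html><body><div id="storyCommentCountNumber">42</div></body></html>';

// Collect libxml parse warnings internally instead of silencing them with @.
libxml_use_internal_errors(true);

$html = new DOMDocument();
$html->loadHTML($sampleHtml);
libxml_clear_errors();

// Convert the DOM to SimpleXML and run the same XPath query as in the question.
$xml = simplexml_import_dom($html);
$matches = $xml->xpath('//*[@id="storyCommentCountNumber"]');

// xpath() returns an array of SimpleXMLElement objects (empty if nothing matched).
// Casting the first match to string yields the text inside the div.
$commentCount = $matches ? trim((string) $matches[0]) : null;

echo $commentCount, PHP_EOL;
```

With a real page, swapping loadHTML($sampleHtml) back to loadHTMLFile($url) should leave the rest unchanged, assuming the element with that id is present in the HTML the server returns rather than filled in later by JavaScript.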