duanjiu4498 2014-08-22 11:41

simple_html_dom does not work as expected

$html = new \simple_html_dom();
$html->load_file('h*ttp://xxx.com/article.html');
$res = $html->find('div[id=content]',0)->find('p');

$arr = array();//result set
foreach($res as $v){
    $arr[] = strip_tags($v->plaintext);
}
print_r($arr);//print

I want to scrape content from a web page. The content is wrapped in the <div> with id 'content', and I retrieve every paragraph enclosed in <p>. The div also contains <figure> tags. In the end my results include the content of both <p> and <figure>. The <figure> content should not be there. What am I doing wrong?

DOM structure

<div id="content"> <p> <p> <figure> <p> <figure> <p> <p> </div>


1 answer

  • dshyu6866 2014-08-22 11:56

    Would this work?

    $res = $html->find('#content p');
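
One possible cause, which is only a guess because the question does not show the real markup, is that some <p> elements sit inside <figure>, so a descendant search returns them too. Below is a minimal sketch of selecting only the paragraphs that are direct children of the div. It uses PHP's built-in DOMDocument and DOMXPath instead of simple_html_dom, and both the sample markup and the helper name directParagraphs are made up for illustration.

```php
<?php
// Hypothetical sample markup; the <p> nested inside <figure> is an
// assumption, not something confirmed by the question.
$sample = '<div id="content">'
        . '<p>first</p>'
        . '<p>second</p>'
        . '<figure><p>caption</p></figure>'
        . '<p>third</p>'
        . '</div>';

/**
 * Return the text of every <p> that is a direct child of the
 * <div id="content"> element, skipping paragraphs nested deeper.
 */
function directParagraphs(string $markup): array
{
    // libxml may warn about HTML5 tags such as <figure>; keep it quiet.
    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($markup);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $texts = array();
    // The single slash before "p" matches children only;
    // a double slash there would match every descendant <p>.
    foreach ($xpath->query('//div[@id="content"]/p') as $p) {
        $texts[] = trim($p->textContent);
    }
    return $texts;
}

$paragraphs = directParagraphs($sample);
print_r($paragraphs);
```

With this sample the nested caption paragraph is left out and only the three top-level paragraphs are returned. If the unwanted text comes from somewhere else in the real page, this will not help, so it is worth checking the actual markup first.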
    
