dongyou5271 2015-07-11 13:42

PHP loadHTML bug workaround causes XPath query problems

I had a problem loading HTML that contains null bytes, so I applied the workaround described here: PHP DOM loadHTML() method unusual warning

The problem is that now any XPath query I run on the "fixed" HTML returns no results at all.

This is what I do:

// Ask the server for UTF-8 content.
$opts = array('http' => array('header' => 'Accept-Charset: UTF-8, *;q=0'));
$context = stream_context_create($opts);
$html = file_get_contents('http://actualidad.rt.com/ultima_hora', false, $context);
// Convert the body to UTF-8 from whichever listed encoding is detected.
$html = mb_convert_encoding($html, 'UTF-8', mb_detect_encoding($html, 'UTF-8, ISO-8859-1', true));
// Strip null bytes to avoid the PHP bug: https://stackoverflow.com/questions/30925533/php-dom-loadhtml-method-unusual-warning
$html = str_replace("\0", '', $html);
$this->dom->loadHTML($html, LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);
$xpath = new DOMXPath($this->dom);
$COUNTDIVS = $xpath->query('//div');

$COUNTDIVS has zero elements, even though the actual HTML contains many div tags.

The same code works fine on websites where the bug doesn't apply.

How can I fix it?

Thanks a lot.
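
A possible way to narrow this down, offered as a sketch rather than a confirmed diagnosis: null bytes in a downloaded page can mean the body is not plain HTML text at all, for example gzip-compressed data. Whether that applies to this site is an assumption. The helpers maybeGunzip() and countDivs() below are hypothetical names, not part of the original code, and the sample string stands in for a real response so the snippet runs without network access. It assumes the zlib and DOM extensions are available.

```php
<?php
// Hypothetical helper: if the body starts with the gzip magic bytes
// (0x1f 0x8b), try to decompress it; otherwise return it unchanged.
function maybeGunzip(string $body): string
{
    if (strncmp($body, "\x1f\x8b", 2) === 0) {
        $decoded = gzdecode($body);
        if ($decoded !== false) {
            return $decoded;
        }
    }
    return $body;
}

// Hypothetical helper: parse HTML and return the number of <div> elements.
// libxml errors are collected into $errors instead of being emitted as warnings.
function countDivs(string $html, array &$errors = []): int
{
    $previous = libxml_use_internal_errors(true);
    $dom = new DOMDocument();
    $dom->loadHTML($html);
    $errors = libxml_get_errors();
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);
    return $xpath->query('//div')->length;
}

// Stand-in for a downloaded page; a real body would come from file_get_contents().
$sample = '<html><body><div>a</div><div>b</div></body></html>';
$compressed = gzencode($sample);

// Decompress first, then parse and count.
$body = maybeGunzip($compressed);
echo countDivs($body), "\n";
```

If the real response turns out to be compressed, stripping "\0" from it would leave binary data that the parser cannot read as markup, which would be consistent with a query for //div finding nothing. Inspecting the first few bytes of $html (for instance with bin2hex(substr($html, 0, 4))) and the collected libxml errors should show whether this assumption holds.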
