dtry54612 2015-03-22 17:28

PHP simple_html_dom cannot correctly parse the Apple Wikipedia page

I am trying to parse a Wikipedia page. For some reason, the code below works for every Wikipedia page I have tried except the Apple page.

include ('simple_html_dom.php');
$url = "http://en.wikipedia.org/wiki/Apple_Inc.";

$html = file_get_html($url);

strlen($html) returns 0 for the Apple page.

Note: the same code works fine when $url points to other Wikipedia pages, e.g. Microsoft (http://en.wikipedia.org/wiki/Microsoft) or Diageo (http://en.wikipedia.org/wiki/Diageo).

I want to use file_get_html so that I get a DOM object I can process further.

1 answer

  • dongxuying7583 2015-03-22 17:47

    Change the MAX_FILE_SIZE constant in simple_html_dom.php to a larger value, e.g.

    define('MAX_FILE_SIZE', 800000);
    

    and you are good to go. This is why you got 0 for the Apple page: its length exceeds the limit, so the following check returns false:

    if (empty($contents) || strlen($contents) > MAX_FILE_SIZE)
    {
        return false;
    }
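
    To see why a size limit shows up as a length of 0, here is a minimal, self-contained sketch. The function load_or_false is a hypothetical stand-in that copies the condition quoted above; it is not the library's code. The 600000 limit and the 700000-byte string are assumed values chosen only for illustration, not measurements of the real page.

    ```php
    <?php
    // Hypothetical stand-in for the guard quoted above (not part of simple_html_dom).
    // $limit plays the role of MAX_FILE_SIZE.
    function load_or_false(string $contents, int $limit)
    {
        if (empty($contents) || strlen($contents) > $limit) {
            return false;
        }
        return $contents;
    }

    // Simulated download of 700,000 bytes (assumed size, for illustration only).
    $page = str_repeat('x', 700000);

    $rejected = load_or_false($page, 600000); // assumed limit below the page size
    $accepted = load_or_false($page, 800000); // raised limit from the answer

    var_dump($rejected);         // bool(false)
    var_dump(strlen($rejected)); // int(0): false is converted to an empty string
    var_dump(strlen($accepted)); // int(700000)
    ```

    In practice it is probably safer to test the result with === false before using it, rather than relying on strlen, since a false return and an empty string look the same to strlen.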
    
    Accepted by the asker as the best answer.
