doumao1917 2018-07-15 20:17

Simple HTML DOM cannot fetch the file

I have no clue what the solution might be. I simply cannot get the HTML of this Charizard page: I get no response at all, even though the link is correct. Bulbasaur works fine, but I want this lovely Charizard...

include("simple_html_dom.php");
$html = file_get_html('https://bulbapedia.bulbagarden.net/wiki/Charizard_(Pok%C3%A9mon)');
$html2 = file_get_html('https://bulbapedia.bulbagarden.net/wiki/Bulbasaur_(Pok%C3%A9mon)');
echo $html;
echo $html2;

Does this page have some kind of protection, or is Charizard just harder to catch? I'd appreciate it if you could help me with this.

Jonas :)


  • dongshanfan1941 2018-07-16 03:45

    There are two problems here:

    1. The content fetched from this URL is longer than MAX_FILE_SIZE (defined in simple_html_dom.php).
    2. The bug pointed out in the comments (https://github.com/sunra/php-simple-html-dom-parser/issues/37). It appears to be fixed in the fork maintained on GitHub, but it still exists in the original version, which no longer seems to be maintained.

    To solve the first problem, edit simple_html_dom.php and change define('MAX_FILE_SIZE', 600000); to use a bigger number.

    As a workaround for the second problem, call file_get_html with explicit arguments and pass 0 for $offset:

    $html = file_get_html(
        'https://bulbapedia.bulbagarden.net/wiki/Charizard_(Pok%C3%A9mon)',
        false, // $use_include_path
        null,  // $context
        0      // $offset
    );

    var_dump($html);

    Alternatively you can use the forked version of the library.
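
To confirm that the first problem applies to a given page, the size check described above can be reproduced by hand. This is only a minimal sketch: exceeds_size_limit is a hypothetical helper, not part of simple_html_dom, and 600000 is the default MAX_FILE_SIZE quoted in this answer. Placeholder strings stand in for the page bodies so the example runs without network access.

```php
<?php
// Hypothetical helper (not part of simple_html_dom): reports whether a
// document is longer than the parser's size limit. The default of 600000
// is the MAX_FILE_SIZE value quoted in the answer.
function exceeds_size_limit(string $contents, int $limit = 600000): bool
{
    return strlen($contents) > $limit;
}

// Stand-ins for real page bodies (assumed sizes, for illustration only).
$small = str_repeat('a', 1000);
$large = str_repeat('a', 700000);

var_dump(exceeds_size_limit($small)); // bool(false)
var_dump(exceeds_size_limit($large)); // bool(true)
```

For a real page, the string returned by file_get_contents($url) could be passed in instead. If the helper reports true, raising MAX_FILE_SIZE as described above is the likely fix.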

    This answer was selected by the asker as the best answer.