PHP simple_html_dom cannot correctly parse the Apple Wikipedia page

I am trying to parse a Wikipedia page. For some reason, the code below works for every Wikipedia page I have tried except the Apple page:

include ('simple_html_dom.php');
$url = "http://en.wikipedia.org/wiki/Apple_Inc.";

$html = file_get_html($url);

For the Apple page, strlen($html) returns 0.

Note: the same code works fine when $url is set to other Wikipedia pages, e.g. Microsoft (http://en.wikipedia.org/wiki/Microsoft) or Diageo (http://en.wikipedia.org/wiki/Diageo).

I want to use file_get_html so that I can load the page into a DOM object and process it further.

Answer by dongxuying7583, 2015-03-22 17:47:

Increase the MAX_FILE_SIZE constant in simple_html_dom.php to, e.g.:

define('MAX_FILE_SIZE', 800000);

and you are good to go. :) This is why you got 0 for the Apple page: the downloaded page is longer than the limit, so the following check in simple_html_dom.php returns false:

if (empty($contents) || strlen($contents) > MAX_FILE_SIZE) { return false; }
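
To see how the guard quoted above behaves, here is a minimal sketch that mirrors it in isolation. The helper name would_be_rejected and the sample string sizes are hypothetical and not part of simple_html_dom; only the condition itself and the limit of 800000 come from the answer.

```php
<?php
// Hypothetical helper that mirrors the size guard quoted in the answer.
// It is NOT part of simple_html_dom; it only reproduces the condition.
function would_be_rejected(string $contents, int $limit): bool
{
    // Same test as the library line: empty content or content over the limit.
    return empty($contents) || strlen($contents) > $limit;
}

$limit = 800000;                      // value suggested in the answer
$small = str_repeat('a', 1000);       // stand-in for a short page
$large = str_repeat('a', 900000);     // stand-in for a page over the limit

var_dump(would_be_rejected($small, $limit)); // bool(false)
var_dump(would_be_rejected($large, $limit)); // bool(true)
```

To check a real page, one could pass the result of file_get_contents($url) to the same helper. Since the guard makes file_get_html return false, it is also worth testing the return value with === false before calling any DOM methods on it.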

This answer was accepted by the asker as the best answer.

Question posted 2015-03-22 17:28; tags: html, php
