PHP simple_html_dom fails to parse the Apple Wikipedia page correctly

I am trying to parse a Wikipedia page. For some reason, the code below works for every Wikipedia page I have tried except the Apple page:

include ('simple_html_dom.php');
$url = "http://en.wikipedia.org/wiki/Apple_Inc.";

$html = file_get_html($url);

strlen($html) returns 0 for the Apple page.

Note: the same code works fine when $url is set to other Wikipedia pages, for example Microsoft (http://en.wikipedia.org/wiki/Microsoft) or Diageo (http://en.wikipedia.org/wiki/Diageo).

I want to use file_get_html so that I get a DOM object I can process further.

1 answer

dongxuying7583 2015-03-22 17:47
Change the MAX_FILE_SIZE constant in simple_html_dom.php to a larger value, e.g.

define('MAX_FILE_SIZE', 800000);

and you are good to go. :) This is why you got 0 for the Apple page: the page is larger than the limit, so the following check returns false:

if (empty($contents) || strlen($contents) > MAX_FILE_SIZE) { return false; }
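To see why an oversized page ends up with a length of 0, here is a minimal sketch of the same guard in isolation. It does not use the library: `sketch_load` and `SKETCH_MAX_FILE_SIZE` are hypothetical names, and 600000 is only an assumed default limit, so check the value in your own copy of simple_html_dom.php.

```php
<?php
// Hypothetical stand-in for the size guard quoted above; NOT the library's code.
const SKETCH_MAX_FILE_SIZE = 600000; // assumed default limit, in bytes

function sketch_load(string $contents, int $limit = SKETCH_MAX_FILE_SIZE)
{
    // Same shape as the quoted guard: empty or oversized input yields false.
    if (empty($contents) || strlen($contents) > $limit) {
        return false;
    }
    return $contents; // the real library would build a DOM object here
}

$small = str_repeat('a', 1000);   // well under the limit
$large = str_repeat('a', 700000); // over the assumed default limit

var_dump(sketch_load($small) !== false);          // bool(true)
var_dump(sketch_load($large));                    // bool(false)
var_dump(strlen((string) sketch_load($large)));   // int(0): false casts to ""
var_dump(sketch_load($large, 800000) !== false);  // bool(true) once the limit is raised
```

In practice it may be worth checking the return value with `=== false` before calling any methods on it, since a page that grows past the limit later would fail in the same silent way.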
This answer was accepted by the asker as the best answer.

Question posted 2015-03-22 17:28. Tags: html, php.