What is wrong with this HTML DOM PHP code?

I'm trying to write code that prints the contents of all the elements with itemprop="price" from some links, but it doesn't work and I can't figure out why. This is the code:

<?php
error_reporting(0);
ini_set('display_errors', 0);
$doc      = new DOMDocument();
$allscan  = array(
    'http://www.mobile54.co.il/30786',
    'http://www.mobile54.co.il/35873',
    'http://www.mobile54.co.il/34722'
);
$alllinks = array();
$html     = file_get_contents($allscan[0]);
$doc->loadHTML($html);
$href = $doc->getElementsByTagName('a');
for ($j = 0; $j < count($allscan); $j++) {
    $html = file_get_contents($allscan[$j]);
    $doc->loadHTML($html);
    $href = $doc->getElementsByTagName('a');
    for ($i = 0; $i < $href->length; $i++) {
        $link = $href->item($i)->getAttribute("href");
        $lin  = preg_replace('/\s+/', '', 'http://www.mobile54.co.il' . $link . "<br />");
        if (strpos($link, 'items/') && !strpos($link, '#techDetailsAName')) {
            if (!in_array($lin, $alllinks)) {
                $alllinks[] = $lin;
            }
        }
    }
}

for ($i = 0; $i < count($alllinks); $i++) {
    echo $alllinks[$i];
}
for ($i = 0; $i < count($alllinks); $i++) {
    $lin  = "$alllinks[$i]";
    $html = file_get_contents($lin);
    $doc->loadHTML('<?xml encoding="UTF-8"?>' . $html);
    $span = $doc->getElementsByTagName('span');
    for ($j = 0; $j < $span->length; $j++) {
        $attr = $span->item($j)->getAttribute('itemprop');
        if ($attr == "price") {
            echo $span->item($j)->textContent . "<br />";
        }
    }
}


?>

When I paste a literal "someurl" instead of $lin it works, but the other way doesn't. I've also tried $html = file_get_contents($alllinks[$i]); but that didn't work either, and I don't know why.

1 answer

duanran3115 2017-04-07 18:48

I think your problem is probably that you appended a <br /> to the end of each URL before storing it in $alllinks, so file_get_contents() is later asked to fetch an address that ends in markup. Beyond that, there is a lot of room to improve the code by using XPath. (Note also that you can pass a URL directly to the DOMDocument object with loadHTMLFile().)
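To make the suspected cause concrete, here is a minimal sketch that runs without any network access. The relative link `/items/12345` is a made-up stand-in for an href scraped from the page; the string building mirrors the line in the question.

```php
<?php
// Hypothetical relative link, standing in for an href scraped from the page.
$link = '/items/12345';

// What the question's code stores: the markup becomes part of the URL.
// The regex strips whitespace, so "<br />" ends up as "<br/>".
$broken = preg_replace('/\s+/', '', 'http://www.mobile54.co.il' . $link . "<br />");

// Keep the stored URL clean and add the markup only when printing.
$clean = 'http://www.mobile54.co.il' . trim($link);

echo $broken, "\n"; // the address file_get_contents() would be asked to fetch
echo $clean . "<br />", "\n"; // display string; $clean itself stays fetchable
```

Storing `$clean` in `$alllinks` and appending `<br />` only inside the `echo` loop should let the later `file_get_contents()` calls receive a plain URL.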

First we pull the href of every <a> element whose href matches. Then we load each of those URLs, search the page for elements whose itemprop attribute is exactly "price", and read the text content of the first match.

<?php
$url = "http://www.mobile54.co.il/30786";
$prices = [];
$hrefs = [];
$combined = [];

$dom = new DOMDocument;
// collect warnings about malformed HTML internally instead of printing them
libxml_use_internal_errors(true);
$dom->loadHTMLFile($url);
$xpath = new DOMXPath($dom);
// get <a> elements with href containing items/ but not #techDetailsAName
$nodes = $xpath->query("//a[contains(@href, 'items/') and not(contains(@href, '#techDetailsAName'))]/@href");
foreach ($nodes as $node) {
    $hrefs[] = trim($node->value);
}
// skip duplicate links, as the question's code does
$hrefs = array_unique($hrefs);

// now you have a list of relative URLs; make each one absolute and fetch it
foreach ($hrefs as $k => &$href) {
    $href = "http://www.mobile54.co.il$href";
    $dom->loadHTMLFile($href);
    $xpath = new DOMXPath($dom);
    // get any element with itemprop of price
    $nodes = $xpath->query("//*[@itemprop='price']");
    // a page may have no price element, so guard against item(0) being null
    $prices[$k] = $nodes->length ? trim($nodes->item(0)->textContent) : null;
}
unset($href); // break the reference left over from the loop above

// now you have $hrefs and $prices, combine them:
foreach ($hrefs as $k => $v) {
    $combined[$k] = [$v, $prices[$k]];
}
print_r($combined);
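The two XPath queries can also be tried offline. The sketch below is only an illustration: the inline HTML is invented sample markup, not the real page, and it uses loadHTML() on a string instead of fetching a URL.

```php
<?php
// Invented sample markup standing in for a fetched page.
$html = <<<HTML
<html><body>
  <a href="/items/1">Phone A</a>
  <a href="/items/1#techDetailsAName">Specs</a>
  <a href="/about">About</a>
  <span itemprop="price">1,299</span>
  <span itemprop="name">Phone A</span>
</body></html>
HTML;

libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

// href attributes containing items/ but not #techDetailsAName
$hrefs = [];
$query = "//a[contains(@href, 'items/') and not(contains(@href, '#techDetailsAName'))]/@href";
foreach ($xpath->query($query) as $attr) {
    $hrefs[] = trim($attr->value);
}

// text of every element whose itemprop is exactly "price"
$prices = [];
foreach ($xpath->query("//*[@itemprop='price']") as $node) {
    $prices[] = trim($node->textContent);
}

print_r($hrefs);
print_r($prices);
```

Testing the expressions against a small string like this makes it easier to tell a wrong query apart from a failed download.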

This answer was accepted by the asker as the best answer.
