Getting the href attribute with XPath in PHP

I am new to PHP and am trying to write a scraper for a website.

I am trying to get an element with the class name categories. I have used:

$showPage = '<li class="categories">Categories<ul>  <li class="cat-item cat-item-940"><a href="http://www.desitvbox.me/category/star-plus/amul-taste-of-india/" >Amul Taste of India</a>
</li>
    <li class="cat-item cat-item-942"><a href="http://www.desitvbox.me/category/star-plus/dance-plus/" >Dance Plus</a>
</li>
    <li class="cat-item cat-item-239"><a href="http://www.desitvbox.me/category/star-plus/diya-aur-baati-hum-star/" >Diya Aur Baati Hum</a>
</li>
    <li class="cat-item cat-item-745"><a href="http://www.desitvbox.me/category/star-plus/suhani-si-ek-ladki/" >Suhani Si Ek Ladki</a>
</li>
    <li class="cat-item cat-item-147"><a href="http://www.desitvbox.me/category/star-plus/star-plus-completed-shows/" >Star Plus Completed Shows</a>
<ul class="children">
    <li class="cat-item cat-item-772"><a href="http://www.desitvbox.me/category/star-plus/star-plus-completed-shows/airlines/" >Airlines</a>
</li>
    <li class="cat-item cat-item-518"><a href="http://www.desitvbox.me/category/star-plus/star-plus-completed-shows/arjun/" >Arjun</a>
</li>
    <li class="cat-item cat-item-237"><a href="http://www.desitvbox.me/category/star-plus/star-plus-completed-shows/chef-pankaj-ka-zayka/" >Chef Pankaj Ka Zayka</a>
</li>
</ul>
</li>
</ul></li>';   
$dom = new DOMDocument();
$dom->validateOnParse = true;
$dom->loadHTML($showPage);  
$dom->preserveWhiteSpace = false;

$allShowsList = new DOMXPath($dom);
$allShowsTableHTML = $allShowsList->query('//li[contains(@class, "categories")]');

However, I now want to read the href values of all the a elements inside $allShowsTableHTML.

Can you please advise how I can do that?

As you can see, one of the records also contains a ul with class "children", whose links I also want to read.

I need to get the href and the title of each link.

I have tried the code below, but it gives no result.

$allShowTableDom = new DOMDocument();
foreach ($allShowTableHTML as $showLink)
{
    $allShowTableDom->appendChild($allShowTableDom->importNode($showLink,true));
} 
$showsArray = $allShowsTableHTML->getElementsByTagName('a');

I think execution never enters the foreach loop.

1 answer

dsaaqdz6223 2015-06-06 09:58
To get all href attributes of the hyperlinks, add some more axis steps to the query, then loop over the result list; the ->value property of each result contains the URI.

Given that you can simply dump all href attributes inside the whole <li> element, extend your query with //a/@href:

$document = new DOMXPath($dom);
$hrefs = $document->query('//li[contains(@class, "categories")]//a/@href');
foreach ($hrefs as $href) {
    echo $href->value;
}

If this returns nodes you don't want, you can instead descend into the contained unordered list and select with a more specific query:

//li[contains(@class, "categories")]/ul/li/a/@href
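The question also asks for the link title alongside the href, which the attribute-only queries above do not return. A minimal sketch of one way to get both, assuming "title" means the link text: select the a elements themselves and read each one's href attribute and text content. The shortened $showPage markup and the $shows array are stand-ins for illustration, not part of the original post.

```php
<?php
// Shortened stand-in for the markup in the question (assumption for illustration).
$showPage = '<li class="categories">Categories<ul>
    <li class="cat-item cat-item-942"><a href="http://www.desitvbox.me/category/star-plus/dance-plus/" >Dance Plus</a></li>
    <li class="cat-item cat-item-147"><a href="http://www.desitvbox.me/category/star-plus/star-plus-completed-shows/" >Star Plus Completed Shows</a>
        <ul class="children">
            <li class="cat-item cat-item-772"><a href="http://www.desitvbox.me/category/star-plus/star-plus-completed-shows/airlines/" >Airlines</a></li>
        </ul>
    </li>
</ul></li>';

// Collect parser warnings internally instead of printing them.
libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTML($showPage);
libxml_clear_errors();

$xpath = new DOMXPath($dom);

// Select the <a> elements (not just @href) so both the attribute and the
// link text are available. The // step also reaches the nested "children" list.
$shows = [];
foreach ($xpath->query('//li[contains(@class, "categories")]//a') as $link) {
    $shows[] = [
        'href'  => $link->getAttribute('href'),
        'title' => trim($link->textContent),
    ];
}

print_r($shows);
```

Because query() returns a DOMNodeList, it is iterated directly; getElementsByTagName() exists on DOMDocument and DOMElement, not on the list, which is why the attempt in the question fails.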
This answer was accepted by the asker as the best answer.