doujiao2443 2015-06-06 09:23
135 views


I am new to PHP and am trying to write a scraper for a website.

I am trying to get an element with the class name categories. I have used:

$showPage = '<li class="categories">Categories<ul>  <li class="cat-item cat-item-940"><a href="" >Amul Taste of India</a>
    <li class="cat-item cat-item-942"><a href="" >Dance Plus</a>
    <li class="cat-item cat-item-239"><a href="" >Diya Aur Baati Hum</a>
    <li class="cat-item cat-item-745"><a href="" >Suhani Si Ek Ladki</a>
    <li class="cat-item cat-item-147"><a href="" >Star Plus Completed Shows</a>
<ul class="children">
    <li class="cat-item cat-item-772"><a href="" >Airlines</a>
    <li class="cat-item cat-item-518"><a href="" >Arjun</a>
    <li class="cat-item cat-item-237"><a href="" >Chef Pankaj Ka Zayka</a>';

$dom = new DOMDocument();
$dom->validateOnParse = true;
$dom->preserveWhiteSpace = false;
$dom->loadHTML($showPage); // assumed: the post does not show the load step

$allShowsList = new DOMXPath($dom);
$allShowsTableHTML = $allShowsList->query('//li[contains(@class, "categories")]'); 

However, I now want to read the href value of every <a> element found in $allShowsTableHTML.

Can you please advise how I can do that?

As you can see, one of the records also has a <ul class="children">, which I also want to read.

I need to get the href and the title.

I have tried the code below, but it gives no result.

$allShowTableDom = new DOMDocument();
foreach ($allShowTableHTML as $showLink)
$showsArray = $allShowsTableHTML->getElementsByTagName('a');

I think it is not entering the foreach loop.

1 answer

  • dsaaqdz6223 2015-06-06 09:58

    To get the href attributes of all the hyperlinks, add a few more axis steps to the query, then loop over the result list; each node's ->value property contains the URI.

    If you just want every href attribute inside the whole <li> element, simply extend your query with //a/@href:

    $document = new DOMXPath($dom);
    $hrefs = $document->query('//li[contains(@class, "categories")]//a/@href');
    foreach ($hrefs as $href) {
      // Each result is an attribute node; ->value holds the URI.
      echo $href->value, PHP_EOL;
    }
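
    Since the question asks for both the href and the title, a variation is to select the <a> elements themselves and read the attribute and the link text from each one. This is only a sketch: the sample markup is shortened from the question, the href values are invented for illustration (the originals are empty), and $xpath and $shows are names chosen here.

```php
<?php
// Hypothetical sample, shortened from the question; href values are made up.
$showPage = '<li class="categories">Categories<ul>
    <li class="cat-item cat-item-940"><a href="/amul">Amul Taste of India</a></li>
    <li class="cat-item cat-item-147"><a href="/completed">Star Plus Completed Shows</a>
        <ul class="children">
            <li class="cat-item cat-item-772"><a href="/airlines">Airlines</a></li>
        </ul>
    </li>
</ul></li>';

$dom = new DOMDocument();
libxml_use_internal_errors(true); // keep parser warnings about the fragment quiet
$dom->loadHTML($showPage);
libxml_clear_errors();

$xpath = new DOMXPath($dom);
$shows = [];

// Select the <a> elements, not the @href attributes, so both values are reachable.
foreach ($xpath->query('//li[contains(@class, "categories")]//a') as $link) {
    $shows[] = [
        'href'  => $link->getAttribute('href'),
        'title' => trim($link->textContent),
    ];
}

print_r($shows);
```

    Because // matches descendants at any depth, the links inside the nested <ul class="children"> are included as well.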

    If this returns nodes you don't want, you could instead descend into the contained unordered list and select with a more specific query:

    //li[contains(@class, "categories")]/ul/li/a/@href
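
    To see how the more specific query differs, the sketch below runs it next to a query aimed only at the nested list. It assumes well-formed sample markup with invented href values; the second query and the helper name hrefValues are illustrations added here, not part of the original question.

```php
<?php
// Hypothetical sample, shortened from the question; href values are made up.
$showPage = '<li class="categories">Categories<ul>
    <li class="cat-item cat-item-940"><a href="/amul">Amul Taste of India</a></li>
    <li class="cat-item cat-item-147"><a href="/completed">Star Plus Completed Shows</a>
        <ul class="children">
            <li class="cat-item cat-item-772"><a href="/airlines">Airlines</a></li>
        </ul>
    </li>
</ul></li>';

$dom = new DOMDocument();
libxml_use_internal_errors(true); // keep parser warnings about the fragment quiet
$dom->loadHTML($showPage);
libxml_clear_errors();

$xpath = new DOMXPath($dom);

// Helper (hypothetical name): run a query for attribute nodes and collect their values.
function hrefValues(DOMXPath $xpath, string $query): array
{
    $values = [];
    foreach ($xpath->query($query) as $attr) {
        $values[] = $attr->value;
    }
    return $values;
}

// Single slashes step through direct children only: top-level shows.
$topLevel = hrefValues($xpath, '//li[contains(@class, "categories")]/ul/li/a/@href');

// Only the links inside the nested <ul class="children">.
$children = hrefValues($xpath, '//li[contains(@class, "categories")]//ul[@class="children"]/li/a/@href');

print_r($topLevel);
print_r($children);
```

    With this sample, the first query skips the nested list and the second returns only its links, so the two can be combined or used separately depending on which shows are needed.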
    This answer was accepted by the asker as the best answer.