dongtan5811 2014-07-22 21:40

Getting specific elements from a page

I'm trying to pull some data from my website. It should be pretty simple, but I can't find any good examples or docs, so I'm having a tough time. I'm trying to build an API so my friends can use my blog. Let's assume my website is at http://www.sample.com and its HTML source is:

<div class="container">
   <a href="/mywebsiteblogpost/">
      <h2 class="title">im the best</h2>
   </a>
   <span class="author">Josue Espinosa</span>
   <div class="thumb">
      <img src="http://www.sample.com/imgsrc" alt="">
      <span class="category">sports</span>
   </div>
   <p>preview text</p>
   <a class="more" href="/mywebsiteblogpost/">full text...</a>
</div>

I want to get the following from .container's children: the href value of the first a child, the text of the elements with class title and author, the src of the img inside .thumb, and the text of the element with class category.

I started with the anchor's href, but I didn't even get that far. I expected $title to hold the href value of the first anchor tag inside the container, but it doesn't work.

$text = file_get_contents('http://www.sample.com');
$doc = new DOMDocument('1.0');
$doc->loadHTML($text);
foreach($doc->getElementsByTagName('div') AS $div) {
    $class = $div->getAttribute('class');
    if(strpos($class, 'container') !== FALSE) {
        // title doesnt retrieve the href value of title :(
        $title = 'TITLE'.$div->getElementsByTagName('a')->getAttribute('href').'<br>';
        //this echos all the text in all of the children of $div
        echo $div->textContent.'<br>';
    }
}

Can anyone explain why please?

3 answers

  • doukangbin9698 2014-07-22 21:49

    The culprit is $div->getElementsByTagName('a')->getAttribute('href'). The first part, $div->getElementsByTagName('a'), returns a list of elements (a DOMNodeList), not a single element. A DOMNodeList has no getAttribute() method, so the chained ->getAttribute('href') call fails instead of returning the href.

    To fix this, iterate over the list, just as you already do with the div tags:

    foreach($div->getElementsByTagName('a') as $a) {
      $href = $a->getAttribute('href');
      if ($href) echo "TITLE$href<br>";
    }
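
For the other fields the question asks about (title, author, image src, category), one possible approach is DOMXPath rather than nested loops. The following is only a sketch under assumptions: the markup is the sample from the question, embedded as a string so the snippet runs without network access; classTest() is a hypothetical helper introduced here for illustration, and the array keys are arbitrary names.

```php
<?php
// Sample markup copied from the question. A real script would load it
// with file_get_contents('http://www.sample.com') instead.
$html = <<<'HTML'
<div class="container">
   <a href="/mywebsiteblogpost/">
      <h2 class="title">im the best</h2>
   </a>
   <span class="author">Josue Espinosa</span>
   <div class="thumb"> <img src="http://www.sample.com/imgsrc" alt="">
   <span class="category">sports</span>
   </div>
   <p>preview text</p>
   <a class="more" href="/mywebsiteblogpost/">full text...</a>
</div>
HTML;

// Hypothetical helper (not part of PHP): builds an XPath predicate that
// matches a whole class token, so "container" does not match "containers".
function classTest(string $name): string
{
    return "contains(concat(' ', normalize-space(@class), ' '), ' $name ')";
}

// Collect libxml parse warnings internally instead of printing them.
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);
$posts = [];

foreach ($xpath->query('//div[' . classTest('container') . ']') as $container) {
    // string(...) yields the string value of the first matching node in
    // document order, or '' when nothing matches. The second argument
    // makes the expression relative to the current container.
    $posts[] = [
        'href'     => $xpath->evaluate('string(.//a/@href)', $container),
        'title'    => trim($xpath->evaluate('string(.//*[' . classTest('title') . '])', $container)),
        'author'   => trim($xpath->evaluate('string(.//*[' . classTest('author') . '])', $container)),
        'image'    => $xpath->evaluate('string(.//div[' . classTest('thumb') . ']//img/@src)', $container),
        'category' => trim($xpath->evaluate('string(.//*[' . classTest('category') . '])', $container)),
    ];
}

print_r($posts);
```

If only the first anchor is needed and XPath feels like overkill, $div->getElementsByTagName('a')->item(0) returns a single element (or null when there is none), and getAttribute('href') can be called on that.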
    
    This answer was selected by the asker as the best answer.