dongyan3616 2013-07-09 12:48

PHP DOMDocument: loadHTML and getElementsByTagName return nothing

$urlToScrap = "https://play.google.com/store/apps/details?id=flipboard.app#?t=W251bGwsMSwxLDIxMiwiZmxpcGJvYXJkLmFwcCJd";
$pageContentData = file_get_contents($urlToScrap);
$doc = new DOMDocument();
$doc->loadHTML($pageContentData);
$listOfDivs = $doc->getElementsByTagName("div");
foreach ($listOfDivs as $div) {
    if($div->getAttribute("class") == "doc-banner-icon"){
        $img = $div->getElementsByTagName("img");
        var_dump($img->getAttribute("src"));
    }
}

This returns nothing.

The DOM contains the following elements:

<div class="doc-banner-icon"><img src="somesrc"></div>

I'm trying to get the img src. Since the page contains many images, I would like to get the parent div first and then extract the image inside it.

The solution is here:

$urlToScrap = "https://play.google.com/store/apps/details?id=flipboard.app#?t=W251bGwsMSwxLDIxMiwiZmxpcGJvYXJkLmFwcCJd";
$pageContentData = file_get_contents($urlToScrap);
$doc = new DOMDocument();
$doc->loadHTML($pageContentData);
$listOfDivs = $doc->getElementsByTagName("div");
foreach ($listOfDivs as $div) {
    if($div->getAttribute("class") == "doc-banner-icon"){
        $listOfImages = $div->getElementsByTagName("img");
        foreach($listOfImages as $img){
            var_dump($img->getAttribute("src"));
        }
    }
}

1 answer

  • donglue8180 2013-07-09 12:57

    You aren't missing anything; var_dump just doesn't show what you expect when given a DOMNodeList. Try this instead:

    $listOfImages = $doc->getElementsByTagName("img");
    
    foreach ($listOfImages as $img) {
        $imgClass = $img->getAttribute('class');
    
        echo $imgClass;
    }
    

    In your updated question, $img is a DOMNodeList rather than a single element, so just change:

    $img->getAttribute("src")
    

    to:

    $img->item(0)->getAttribute("src")
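
The change above works because getElementsByTagName() always returns a DOMNodeList, even when only one element matches. A minimal sketch of the same idea follows, assuming a hard-coded HTML string (hypothetical, standing in for the fetched page) and adding a length check, since item(0) returns null on an empty list:

```php
<?php
// Hypothetical inline markup standing in for the fetched page content.
$html = '<html><body><div class="doc-banner-icon"><img src="somesrc"></div></body></html>';

$doc = new DOMDocument();
$doc->loadHTML($html);

// getElementsByTagName() returns a DOMNodeList, not a single DOMElement.
$images = $doc->getElementsByTagName("img");

// item(0) returns null when the list is empty, so check the length first.
$src = $images->length > 0 ? $images->item(0)->getAttribute("src") : null;

var_dump($src);
```

If the list may hold several images, iterating with foreach, as in the question's updated code, avoids the index lookup altogether.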
    

    Given that your selection criteria are fairly complex, you might consider using XPath instead of navigating the tree manually:

    $doc = new DOMDocument();
    $doc->loadHTML($pageContentData);
    
    $xpath = new DOMXPath($doc);
    $img = $xpath->query("//div[@class = 'doc-banner-icon']/img");
    
    var_dump($img->item(0)->getAttribute('src'));
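
The XPath query above compares the whole class attribute, so it would miss a div that carries additional classes. The following is a sketch of a more defensive variant, not part of the original answer. It assumes a hard-coded HTML string with an extra, made-up "large" class, matches the class as a single token, silences libxml parse warnings, and guards against an empty result:

```php
<?php
// Hypothetical inline markup; the extra "large" class is an assumption for illustration.
$html = '<html><body><div class="doc-banner-icon large"><img src="somesrc"></div></body></html>';

// Collect libxml parse warnings internally instead of printing them for messy real-world HTML.
libxml_use_internal_errors(true);

$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);

// Match "doc-banner-icon" as one whitespace-separated class token.
$query = "//div[contains(concat(' ', normalize-space(@class), ' '), ' doc-banner-icon ')]/img";
$nodes = $xpath->query($query);

// query() returns a DOMNodeList (or false for an invalid expression); guard before item(0).
$src = ($nodes !== false && $nodes->length > 0)
    ? $nodes->item(0)->getAttribute("src")
    : null;

var_dump($src);
```

Whether the token match is needed depends on the actual markup of the page. If the div only ever has the single class, the simpler equality test in the answer is enough.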
    
