Anonymous user 2014-09-14 20:27

Unable to parse specific links with the PHP DOM parser

I'm parsing some iTunes links with a DOM parser in PHP. It works perfectly for most links, but fails for others of exactly the same type. I need the "img" tag and its "src-swap-high-dpi" attribute. It drives me nuts. This is part of my PHP code:

$url = "https://itunes.apple.com/us/podcast/id278981407";
$htmlContent = str_get_html(file_get_contents($url));

foreach ($htmlContent->find("img") as $element) {
    $value = $element->getAttribute("src-swap-high-dpi");
    echo $value;
}

For example, I can parse the following links:

https://itunes.apple.com/us/podcast/id201671138

https://itunes.apple.com/us/podcast/id523121474

https://itunes.apple.com/us/podcast/id152249110

But not this one, for example:

https://itunes.apple.com/us/podcast/id278981407

I do not get any output.

Edit:

The new code doesn't work for me either. Very strange. This is my complete code now:

<?php
ini_set("display_errors", 1); error_reporting(E_ALL);
require_once('simple_html_dom.php');

$url = "https://itunes.apple.com/us/podcast/id278981407";

$htmlContent = str_get_html(file_get_contents($url));


foreach ($htmlContent->find("div.artwork") as $div) {
    $value = $div->find("img", 0)->getAttribute("src-swap-high-dpi");
    echo $value . "<br/>";
}

?>

I get this output:

Fatal error: Call to a member function find() on a non-object in /home/www/whatever/delete.php on line 10

Line 10 is the line starting with "foreach". Your code works fine with the links I listed above as working, but as soon as I use one of the links that doesn't work, I get the error message shown above.

1 answer

  • dtrt2368 2014-09-14 21:08

    I think this is one of the cases where Simple DOM gets a bit confused and you need to give it a parent element to search within:

    $url = "https://itunes.apple.com/us/podcast/id278981407";
    $htmlContent = str_get_html(file_get_contents($url));
    foreach ($htmlContent->find("div.artwork") as $div) {
        $value = $div->find("img", 0)->getAttribute("src-swap-high-dpi");
        echo $value . "<br/>";
    }
    

    UPDATE

    Here are the results using the above fragment:

    http://a3.mzstatic.com/us/r30/Podcasts/v4/61/cc/7f/61cc7f25-131f-7616-6549-5553e6444b87/mza_7489225285918350214.150x150-75.jpg
    http://a2.mzstatic.com/us/r30/Podcasts6/v4/04/a9/64/04a964d7-7c10-72d6-871b-97619cf89066/mza_1416781107029663068.150x150-75.jpg
    http://a5.mzstatic.com/us/r30/Podcasts4/v4/bb/a6/f4/bba6f4b6-eeab-d7d9-8591-adb2bd277ccb/mza_5223368352447971673.150x150-75.jpg
    http://a1.mzstatic.com/us/r30/Podcasts5/v4/aa/54/16/aa541600-cc8b-772b-9c0a-824efe8fdc42/mza_6772270613386652594.150x150-75.jpg
    http://a2.mzstatic.com/us/r30/Podcasts3/v4/95/3d/2f/953d2f75-c2c2-4815-a752-f30fdcc0b9fb/mza_9037746738018570312.150x150-75.jpg
    http://a4.mzstatic.com/us/r30/Podcasts4/v4/a2/1c/f5/a21cf5a4-2d8d-1ed7-983f-1c90f2f4f948/mza_7120473049241631392.340x340-75.jpg
    http://a2.mzstatic.com/us/r30/Podcasts4/v4/5d/21/8d/5d218d2a-2980-0ac9-0bc7-9321ea6eb334/mza_6358466742996313573.150x150-75.jpg
    http://a1.mzstatic.com/us/r30/Podcasts/b2/bb/bf/ps.ykmejwzs.150x150-75.jpg
    http://a4.mzstatic.com/us/r30/Podcasts6/v4/17/ea/31/17ea3187-ef8c-4756-e488-0c65adced988/mza_7931750363714403933.150x150-75.jpg
    http://a1.mzstatic.com/us/r30/Podcasts2/v4/0b/3c/7d/0b3c7d2b-19bf-f7a2-7c50-ca15338b8316/mza_2792239161425784587.150x150-75.jpg
    

    Can you verify that you really are getting no errors at all? For instance, type some random characters into your PHP file: does PHP show an error? If not, try adding this to your .htaccess file:

    <IfModule mod_php5.c>
       # display errors in the browser
       php_flag display_errors on
    </IfModule>
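
Whatever the display settings are, the script can also check each step itself rather than calling find() on a value that is not an object, which is what the fatal error in the question reports. Below is a minimal sketch of that idea. It swaps Simple DOM for PHP's bundled DOM extension (DOMDocument and DOMXPath) so that it runs without the library. The function name artworkUrls is made up for illustration, and the div.artwork / src-swap-high-dpi structure is taken from the snippets in this thread.

```php
<?php
// Hypothetical helper: collect the "src-swap-high-dpi" value of the first
// <img> carrying that attribute inside every <div class="... artwork ...">.
// Returns an empty array instead of failing when the input cannot be parsed.
function artworkUrls(string $html): array
{
    if (trim($html) === '') {
        return []; // nothing was downloaded, so there is nothing to parse
    }

    // Real-world pages are rarely valid HTML; keep parser warnings quiet.
    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $loaded = $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);
    if (!$loaded) {
        return [];
    }

    $xpath = new DOMXPath($doc);
    // XPath equivalent of the CSS selector "div.artwork".
    $divs = $xpath->query(
        '//div[contains(concat(" ", normalize-space(@class), " "), " artwork ")]'
    );

    $urls = [];
    foreach ($divs as $div) {
        // Search only inside this div; skip divs without a matching image.
        $img = $xpath->query('.//img[@src-swap-high-dpi]', $div)->item(0);
        if ($img instanceof DOMElement) {
            $urls[] = $img->getAttribute('src-swap-high-dpi');
        }
    }
    return $urls;
}
```

Given the string returned by file_get_contents() or curl_exec(), once that call is known not to have returned false, a loop over artworkUrls($html) simply prints nothing for a page without artwork instead of stopping with a fatal error.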
    

    UPDATE 2

    $url = "https://itunes.apple.com/us/podcast/id278981407";

    // Fetch the page with cURL instead of file_get_contents()
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);     // return the body as a string
    curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, FALSE); // skips certificate verification; for testing only
    $html = curl_exec($ch);
    curl_close($ch);

    //$htmlContent = str_get_html(file_get_contents($url));
    $htmlContent = str_get_html($html);
    foreach ($htmlContent->find("div.artwork") as $div) {
        $value = $div->find("img", 0)->getAttribute("src-swap-high-dpi");
        echo $value . "<br/>";
    }
    

    The reason I didn't use Simple DOM's file_get_html is that it simply uses file_get_contents internally.
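
If the download itself is the step that fails, it helps to see cURL's own error message. The sketch below is a variant of the fetch in UPDATE 2 under a few assumptions of mine: the function name fetchHtml, the 15-second timeout, following redirects, and leaving certificate verification switched on (unlike the snippet above) are illustrative choices, not part of the original code.

```php
<?php
// Hypothetical helper: download a page and return its body, or null on failure.
// On failure, $error receives a short description of what went wrong.
function fetchHtml(string $url, ?string &$error = null): ?string
{
    $error = null;

    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the body as a string
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); // follow HTTP redirects
    curl_setopt($ch, CURLOPT_TIMEOUT, 15);          // arbitrary limit, in seconds

    $body = curl_exec($ch);
    if ($body === false) {
        // Transport-level failure: DNS, TLS, unsupported protocol, timeout, ...
        $error = 'cURL error: ' . curl_error($ch);
        return null;
    }

    $status = curl_getinfo($ch, CURLINFO_HTTP_CODE);
    if ($status >= 400) {
        // The server answered, but with an error page rather than the content.
        $error = 'HTTP status ' . $status;
        return null;
    }

    return $body;
}
```

The returned string can then be passed to str_get_html() as in UPDATE 2. As far as I know, str_get_html() returns false for empty or very large input, so comparing its result with false before calling find() would avoid the fatal error from the question.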

    This answer was selected by the asker as the best answer.