doubengshao8872 2014-07-29 14:47
Views: 26
Accepted

XPath - select an anchor after some text

I'd like to fetch data from this sample code:

<div id="text">

(sd) <a href="http://example.com/somefiledfs.flv">http://example.com/somefiledfs.flv</a>
 - 380 kbps 
 - <a href='/player.swf?config={"clip":{"url":"http://example.com/somefiledfs.flv"}'>Watch</a><br>

(576p) <a href="http://example.com/hgyj.mp4">http://example.com/hgyj.mp4</a>
 - 780 kbps 
 - <a href='/player.swf?config={"clip":{"url":"http://example.com/hgyj.mp4"}'>Watch</a><br>

</div>

I'd like to get it as:

sd - http://example.com/somefiledfs.flv

576p - http://example.com/hgyj.mp4

and so on.

Could somebody help? I've been trying to use "//div[@id='text']/a" and ancestor/preceding, but I can't work it out.


1 answer

  • doujing5150 2014-07-29 19:40

    Here's a working PHP snippet. It loops over all the links in the div, then checks whether the node just before each link matches sd|576p (extend the pattern with more formats if needed).

    <?php 
    $html = <<<HTML
    <div id="text">
      (sd) <a href="http://example.com/somefiledfs.flv">http://example.com/somefiledfs.flv</a>   
        - 380 kbps 
        - <a href='/player.swf?config={"clip":{"url":"http://example.com/somefiledfs.flv"}'>Watch</a><br>
    
      (576p) <a href="http://example.com/hgyj.mp4">http://example.com/hgyj.mp4</a>
        - 780 kbps 
        - <a href='/player.swf?config={"clip":{"url":"http://example.com/hgyj.mp4"}'>Watch</a><br>
    
    </div>
    HTML;
    
    $dom = new DOMDocument();
    $dom->loadHTML($html);
    $xpath = new DOMXPath($dom);
    
    $as = $xpath->query("//div[@id='text']/a");
    
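    foreach ($as as $a) {
      // The quality label, e.g. "(sd)", sits in the node just before the link.
      $prev = $a->previousSibling;
      if ($prev === null) {
        continue; // nothing precedes this link, so there is no label to read
      }
    
      // The "Watch" links are skipped because their preceding text has no label.
      if (preg_match("/sd|576p/", $prev->nodeValue, $matches)) {
        echo $matches[0]." - ".$a->nodeValue."\n";
      }
    }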
    ?>
    

    Here's a link to the snippet: https://eval.in/173038
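
Since the question asked about the preceding axis, here is an alternative sketch (not part of the original answer) that moves the filtering into XPath itself. It assumes each quality label is wrapped in parentheses in the text node nearest before its link, as in the sample markup. The function name `extractLabelledLinks` is hypothetical, and it reads the URL from the `href` attribute rather than from the link text.

```php
<?php
// Hypothetical helper, for illustration only: returns "label - url" strings.
function extractLabelledLinks(string $html): array
{
    $dom = new DOMDocument();
    // Suppress the warnings libxml may raise for loosely formed HTML.
    @$dom->loadHTML($html);
    $xpath = new DOMXPath($dom);

    // Keep only anchors whose nearest preceding text sibling contains "(".
    // On the sample markup this skips the "Watch" links.
    $links = $xpath->query(
        "//div[@id='text']/a[contains(preceding-sibling::text()[1], '(')]"
    );

    $result = [];
    foreach ($links as $a) {
        // Take whatever sits between "(" and ")" in that same text node.
        $label = $xpath->evaluate(
            "string(substring-before(substring-after(preceding-sibling::text()[1], '('), ')'))",
            $a
        );
        $result[] = $label . " - " . $a->getAttribute('href');
    }
    return $result;
}
```

Called on the sample markup from the question, this should return the two strings "sd - http://example.com/somefiledfs.flv" and "576p - http://example.com/hgyj.mp4". Note that `preceding-sibling::text()[1]` picks the nearest preceding text node, not necessarily the node immediately before the link, so two adjacent links with no text between them would share a label.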

    This answer was selected by the asker as the best answer.
