duanju8308 2014-11-30 05:43
Views: 27
Accepted

Parsing an HTML document with anchor tags

Say I have

<a href="www.myurl/point.html" class="l" style="color:#436DBA;" onclick="return rs(this,'8 Stunning Linguistic Miracles of The Holy Quran | Kinetic Typography 144p (Video Only).mp4');">&raquo; Download MP4 &laquo;</a> - <b>144p (Video Only)</b> - <span> 19.1</span> MB<br />

in an HTML page like this. I want to parse it with the PHP Simple HTML DOM Parser and get "Download MP4 144p 19.1" as output. I tried this code:

foreach ($displaybody->find('a') as $element) {
    echo $element->innertext . '<br/>';
}

It returned only "Download MP4". How do I parse the remaining values so that I get "Download MP4 144p 19.1"? Please help me out.

1 answer

  • dsgdhtr_43654 2014-11-30 05:48

    You can't target the <a> tag alone, since some of the text you're trying to access lies outside it. Target the document itself and then use ->plaintext:

    $html = <<<EOT
    <a href="www.myurl/point.html" class="l" style="color:#436DBA;" onclick="return rs(this,'8 Stunning Linguistic Miracles of The Holy Quran | Kinetic Typography 144p (Video Only).mp4');">&raquo; Download MP4 &laquo;</a> - <b>144p (Video Only)</b> - <span> 19.1</span> MB<br />
    EOT;
    
    $displaybody = str_get_html($html);
    echo $displaybody->plaintext;
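
If you'd rather pick out the three values individually than take the whole line, here is a rough sketch using PHP's built-in DOM extension instead of Simple HTML DOM. It assumes the row always contains one <a>, one <b> and one <span> as in your sample; the variable names ($label, $quality, $size, $result) are only illustrative, and the style/onclick attributes are dropped to keep it short.

```php
<?php
// Same fragment as in the question (style/onclick attributes omitted for brevity).
$html = '<a href="www.myurl/point.html" class="l">&raquo; Download MP4 &laquo;</a>'
      . ' - <b>144p (Video Only)</b> - <span> 19.1</span> MB<br />';

$dom = new DOMDocument();
libxml_use_internal_errors(true);   // keep libxml warnings about the fragment quiet
$dom->loadHTML($html);
libxml_clear_errors();
$xpath = new DOMXPath($dom);

// Link text with the decoded angle quotes (U+00AB, U+00BB) stripped.
$label = trim(preg_replace('/[\x{AB}\x{BB}]/u', '', $xpath->evaluate('string(//a)')));

// Keep only the "144p" part of the <b> text (assumes a digits-then-"p" pattern).
preg_match('/\d+p/', $xpath->evaluate('string(//b)'), $m);
$quality = $m[0] ?? '';

// The <span> holds the size with a leading space.
$size = trim($xpath->evaluate('string(//span)'));

$result = "$label $quality $size";
echo $result, PHP_EOL;
```

For the sample row this prints "Download MP4 144p 19.1". If a page has several such rows, the //a, //b and //span lookups would need to be scoped per row rather than taken from the whole document.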
    

    Here is another way of accessing each row, through DOMDocument with XPath:

    // Load the site's HTML page into DOMDocument
    $dom = new DOMDocument();
    libxml_use_internal_errors(true); // suppress warnings from malformed HTML
    $html_page = file_get_contents('http://www.mohammediatechnologies.in/download/downloadtest.php?name=8KPEiGqDQHg');
    $dom->loadHTML(mb_convert_encoding($html_page, 'HTML-ENTITIES', 'UTF-8'));
    libxml_clear_errors();
    $xpath = new DOMXPath($dom);
    
    $data = array();
    // Select elements that have a preceding <br> sibling and a following <a> sibling; the elements up to each <br> are treated as one row
    $links = $xpath->query('//*[following-sibling::a and preceding-sibling::br]');
    
    $temp = '';
    foreach ($links as $link) { // walk the matched elements in document order
    
        $temp .= $link->textContent . ' '; // accumulate this element's text
    
        if ($link->tagName == 'br') { // a <br> marks the end of a row
            $unit = $xpath->evaluate('string(./preceding-sibling::text()[1])', $link); // text node just before the <br> (the " MB" unit)
            $data[] = $temp . $unit; // push the finished row into the array
            $temp = '';
        }
    }
    
    echo '<pre>';
    print_r($data);
    

    Sample Output
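
To try the row-grouping idea without fetching the live page, here is a small offline sketch. Note that it swaps in a simpler technique than the XPath sibling query above: it walks the child nodes of a container in order and closes a row at every <br>. The wrapping <div id="rows"> and the second row (360p, 42.5) are invented for illustration and are not taken from the real page.

```php
<?php
// Hypothetical two-row sample; the container div and the second row are made up.
$html = '<div id="rows">'
      . '<a href="#">Download MP4</a> - <b>144p (Video Only)</b> - <span> 19.1</span> MB<br />'
      . '<a href="#">Download MP4</a> - <b>360p</b> - <span> 42.5</span> MB<br />'
      . '</div>';

$dom = new DOMDocument();
libxml_use_internal_errors(true);
$dom->loadHTML($html);
libxml_clear_errors();
$xpath = new DOMXPath($dom);

$rows = [];
$current = '';
// Visit every direct child of the container (elements and text nodes) in document order.
foreach ($xpath->query('//div[@id="rows"]/node()') as $node) {
    if ($node->nodeName === 'br') {
        // A <br> ends the row: collapse runs of whitespace and store the line.
        $rows[] = trim(preg_replace('/\s+/', ' ', $current));
        $current = '';
        continue;
    }
    $current .= $node->textContent;
}

print_r($rows);
```

Because text nodes are visited too, the " - " separators and the " MB" unit land in the row without a separate lookup. On the real page the container selector would have to be adapted to whatever element actually wraps the links.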

    This answer was selected by the asker as the best answer.
