dongzi8191 2015-12-03 02:45

Simple HTML DOM: getting the href and anchor text from inside a heading

For starters, this is the code that I have:

    <?php
    include ('parser_class.php');
        $source = file_get_html('http://www.billboard.com/search/site/awards?f[0]=ss_bb_type%3Aarticle');
        $title = $source->find('h3.title'); //getting song title
    ?>
    <div id="awar">
    <?php
        if ($title){
            $title = array_slice($title, 0, 10);
            foreach($title as $titles){
                $links = $titles->href;
                $string = $titles->innertext;
                //$string = (strlen($string) > 75) ? substr($string,0,72).'...' : $string;
    ?>
            <center>
            <table style="width: 100%;">
                <tr>
                    <td style="width: 50%; text-align: left; padding-left: 5px;"><span class="song"><?php echo $string ?></span></td><td style="width: 25%; text-align: left; padding-left: 5px;"><a href="http://www.billboard.com<?php echo $links ?>" class="download">Read Article</a></td>
                </tr>
            </table>
            </center>
            <hr class="betw" />

    <?php
            }
        }
        else{
            echo"<p class='song'>No Articles Found</p>";
        }
    ?>

Since the website has no classes on its links, I have to pull my information from something like this:

    <h3 class="title"> <a href="/articles/columns/country/6784891/lady-antebellum-charles-kelley-steps-out-on-his-own">Lady Antebellum's Charles Kelley Steps Out On His Own In New York City</a> </h3>

Calling innertext on the h3 gives me everything within the h3, including the anchor tag itself.

What I need is to figure out how to get the href and the anchor text separately from within the h3.

Is there a way to get the href of the anchor from the h3's innertext, and then the innertext of that anchor?

I wish that this site had a class on their links as that would of course make this tons easier. I have used these functions with no issues because of the websites actually using classes on their links, but it looks like billboard has decided to make things harder for me!

A point in the right direction would be greatly appreciated.

NOTE: My parser_class.php is the one that is located here

1 answer

  • dtmbc1606 2015-12-03 03:32

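    Instead of selecting the h3 with class title, you have to select the anchor inside it, so the selector becomes h3.title a. From that anchor you can get both the href and the anchor text. One way to get the href is to create a SimpleXMLElement object from the anchor's HTML. Since the selected element is now the anchor itself, the href and innertext properties already used in the question should also return the link and its text directly.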

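    <?php
        include ('parser_class.php');
        $source = file_get_html('http://www.billboard.com/search/site/awards?f[0]=ss_bb_type%3Aarticle');
        // Select the anchors inside each h3.title instead of the h3 itself
        foreach ($source->find('h3.title a') as $anchor) {
            // Parse the anchor's own HTML so its text and attributes can be read
            $anch = new SimpleXMLElement($anchor->outertext);
            echo "Anchor text is : ".$anch;
            echo "<br>";
            echo "href is : ";
            $link_href = (string) $anch['href'];
            echo $link_href;
            echo "<hr>";
        }
    ?>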
    
    This answer was selected by the asker as the best answer.
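
For comparison, here is a minimal sketch of the same idea (select the anchor inside h3.title, then read its href and text) using PHP's built-in DOMDocument and DOMXPath instead of Simple HTML DOM. It runs against the sample markup quoted in the question rather than the live page. The helper name extractTitleLinks and its $limit parameter are made up for illustration and are not part of any library.

```php
<?php
// Sketch only: uses PHP's bundled DOM extension, not parser_class.php.
// extractTitleLinks() is a hypothetical helper name chosen for this example.
function extractTitleLinks($html, $limit = 10)
{
    $doc = new DOMDocument();
    // Real-world HTML rarely validates, so collect parser warnings quietly.
    libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($doc);
    // Anchors that are direct children of an h3 whose class list contains "title".
    $anchors = $xpath->query(
        '//h3[contains(concat(" ", normalize-space(@class), " "), " title ")]/a'
    );

    $links = [];
    foreach ($anchors as $a) {
        if (count($links) >= $limit) {
            break; // same cap as the array_slice($title, 0, 10) in the question
        }
        $links[] = [
            'href' => $a->getAttribute('href'),
            'text' => trim($a->textContent),
        ];
    }
    return $links;
}

// Sample markup copied from the question.
$html = '<h3 class="title"> <a href="/articles/columns/country/6784891/lady-antebellum-charles-kelley-steps-out-on-his-own">Lady Antebellum\'s Charles Kelley Steps Out On His Own In New York City</a> </h3>';

$links = extractTitleLinks($html);
foreach ($links as $link) {
    echo 'href: ' . $link['href'] . PHP_EOL;
    echo 'text: ' . $link['text'] . PHP_EOL;
}
```

Each entry keeps the relative href as it appears in the markup, so the http://www.billboard.com prefix from the question's template would still need to be prepended when building the link.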
