dongzi8191 2015-12-03 02:45

Simple HTML DOM: getting the href and anchor text from within a heading

For starters, this is the code that I have:

    <?php
    include ('parser_class.php');
        $source = file_get_html('http://www.billboard.com/search/site/awards?f[0]=ss_bb_type%3Aarticle');
        $title = $source->find('h3.title'); //getting song title
    ?>
    <div id="awar">
    <?php
        if ($title){
            $title = array_slice($title, 0, 10);
            foreach($title as $titles){
                $links = $titles->href;
                $string = $titles->innertext;
                //$string = (strlen($string) > 75) ? substr($string,0,72).'...' : $string;
    ?>
            <center>
            <table style="width: 100%;">
                <tr>
                    <td style="width: 50%; text-align: left; padding-left: 5px;"><span class="song"><?php echo $string ?></span></td><td style="width: 25%; text-align: left; padding-left: 5px;"><a href="http://www.billboard.com<?php echo $links ?>" class="download">Read Article</a></td>
                </tr>
            </table>
            </center>
            <hr class="betw" />

    <?php
            }
        }
        else{
            echo"<p class='song'>No Articles Found</p>";
        }
    ?>

Since the website has no classes on its links, I have to pull my information from markup like this:

    <h3 class="title"> <a href="/articles/columns/country/6784891/lady-antebellum-charles-kelley-steps-out-on-his-own">Lady Antebellum's Charles Kelley Steps Out On His Own In New York City</a> </h3>

Calling innertext on the h3 returns everything within it, including the anchor tag itself.

What I need is to get the href and the anchor text separately from within the h3.

Is there a way to get the href from the innertext, and then the text of the anchor itself?

I wish this site had a class on its links, as that would make this much easier. I have used these functions without issues on websites that do put classes on their links, but Billboard does not.

A point in the right direction would be greatly appreciated.

NOTE: My parser_class.php is the one that is located here

1 answer

  • dtmbc1606 2015-12-03 03:32

    Instead of selecting the h3 with class title, select the anchor inside it, using the selector h3.title a. From that anchor you can get both the href and the anchor text. To get the href, you can create a SimpleXMLElement object from the anchor's HTML.

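    <?php
        include ('parser_class.php');
        $source = file_get_html('http://www.billboard.com/search/site/awards?f[0]=ss_bb_type%3Aarticle');
        // Select the anchors inside each h3.title rather than the h3 itself
        foreach ($source->find('h3.title a') as $anchor) {
            // Casting the node to a string gives the anchor's outer HTML,
            // which SimpleXMLElement then parses (it must be well-formed XML)
            $anch = new SimpleXMLElement((string) $anchor);
            echo "Anchor text is : ".$anch;
            echo "<br>";
            echo "href is : ";
            echo $link_href = $anch['href']; // attribute access on the parsed element
            echo "<hr>";
        }
    ?>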
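
For comparison, here is a minimal sketch of the same "select the anchor, not the h3" idea using PHP's built-in DOMDocument and DOMXPath instead of the parser class, so it runs without any external file. The helper name extract_title_links is hypothetical, and the markup is the single sample from the question rather than a live fetch. With Simple HTML DOM itself, reading $anchor->href and $anchor->plaintext on each matched anchor should give the same two values without the SimpleXMLElement round trip.

```php
<?php
// Sample markup copied from the question; a real page would be fetched first.
$html = '<h3 class="title"> <a href="/articles/columns/country/6784891/lady-antebellum-charles-kelley-steps-out-on-his-own">Lady Antebellum\'s Charles Kelley Steps Out On His Own In New York City</a> </h3>';

/**
 * Hypothetical helper (not part of any library): returns one
 * ['href' => ..., 'text' => ...] pair per anchor that is a direct
 * child of an <h3> whose class list contains "title".
 */
function extract_title_links(string $html): array
{
    // Collect libxml warnings internally instead of printing them,
    // since real-world HTML is rarely perfectly valid.
    libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($doc);
    // XPath equivalent of the CSS selector "h3.title > a".
    $query = '//h3[contains(concat(" ", normalize-space(@class), " "), " title ")]/a';

    $links = [];
    foreach ($xpath->query($query) as $a) {
        $links[] = [
            'href' => $a->getAttribute('href'), // attribute value only
            'text' => trim($a->textContent),    // anchor text only, no tags
        ];
    }
    return $links;
}

foreach (extract_title_links($html) as $link) {
    echo 'Anchor text is : ' . htmlspecialchars($link['text']) . "\n";
    echo 'href is : http://www.billboard.com' . htmlspecialchars($link['href']) . "\n";
}
```

The hrefs in the sample are site-relative, so the sketch prefixes the host the same way the table in the question does.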
    
    This answer was accepted by the asker as the best answer.
