Anonymous user 2015-10-20 09:01

Unable to get the exact values from another page

I am trying to get the league table from this page: http://www.skysports.com/football/competitions/bundesliga/table. I fetch it with:

$bundes = file('http://www.skysports.com/football/competitions/bundesliga/table');

When I try to display the array $bundes, I do it like this:

echo '<pre>', print_r($bundes, true), '</pre>';

The output looks like this:

[1437] => 
[1022] => German Bundesliga 2015/16
#   Team    Pl  W   D   L   F   A   GD  Pts Last 6
1   [1059] => [1060] => Bayern Munich [1061] => [1062] =>   9   9   0   0   29  4   25  27  [1072] =>
[1073] =>
[1074] =>

This is the first row of the table. I can display $bundes[1060] and get Bayern Munich as output, but how can I get the values from $bundes[1062] (9, 9, 0, 0, 29, 4, 25 and 27)? I need to display each of these values in its own <td></td>. When I echo $bundes[1062], I get nothing.

1 answer

  • dongwang6837 2015-10-20 10:00

    A more reliable way of extracting the data is using DOM manipulation classes to do something like:

    $doc = new \DOMDocument();
    // Real-world HTML is rarely valid; @ suppresses the parser warnings.
    @$doc->loadHTMLFile('http://www.skysports.com/football/competitions/bundesliga/table');
    
    $xpath = new \DOMXPath($doc);
    // Every row in the body of the table
    $rows = $xpath->query('//tbody/tr');
    
    $data = [];
    
    foreach ($rows as $i => $row) {
        // Cells of this row only (the query is relative to $row)
        $columns = $xpath->query('td', $row);
    
        foreach ($columns as $column) {
            $data[$i][] = trim($column->textContent);
        }
    }
    
    print_r($data);
    

    Which gives you:

    Array
    (
        [0] => Array
            (
                [0] => 1
                [1] => Bayern Munich
                [2] => 9
                [3] => 9
                [4] => 0
                [5] => 0
                [6] => 29
                [7] => 4
                [8] => 25
                [9] => 27
                [10] => 
            )
    ...
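
To get each value into its own <td>, as the question asks, the $data array can be looped over once more. This is only a minimal sketch, assuming $data has the shape shown above. renderRows is a hypothetical helper name that is not part of the original answer, and the sample row is copied from the output above rather than fetched live:

```php
<?php
// Hypothetical helper (not from the original answer): turns an array of
// rows, shaped like $data above, into HTML table rows with one <td> per value.
function renderRows(array $data): string
{
    $html = '';
    foreach ($data as $row) {
        $html .= '<tr>';
        foreach ($row as $cell) {
            // Escape each value so scraped text cannot inject markup.
            $html .= '<td>' . htmlspecialchars($cell) . '</td>';
        }
        $html .= "</tr>\n";
    }
    return $html;
}

// Sample row copied from the output above; in practice $data would come
// from the XPath loop.
$data = [
    ['1', 'Bayern Munich', '9', '9', '0', '0', '29', '4', '25', '27', ''],
];

echo "<table>\n", renderRows($data), "</table>\n";
```

Individual values are then also available directly, for example $data[0][1] for the team name and $data[0][9] for the points.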
    

    Regarding Dagon's comment: no terms can disallow crawling and extracting the data (as long as you do so at a reasonable rate that does not impact the website's performance). Terms of use and copyright law do, however, dictate what you can and cannot do with the crawled content (e.g. republishing it).

    Web scraping may be against the terms of use of some websites. The enforceability of these terms is unclear (see "FAQ about linking – Are website terms of use binding contracts?").

    - Wikipedia, Web scraping: Legal issues

    BTW, the page's robots meta tag does allow INDEX.
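
On the point about crawling at a reasonable rate: one possible approach is to cache the fetched page locally, so the remote site is contacted at most once per interval instead of on every request. The sketch below is an assumption, not part of the original answer; cachedFetch, the cache file name, and the one-hour lifetime are all hypothetical choices:

```php
<?php
// Hypothetical helper: returns the cached copy if it is younger than $ttl
// seconds; otherwise calls $fetch (any callable returning the page as a
// string), stores the result in $cacheFile, and returns it.
function cachedFetch(string $cacheFile, int $ttl, callable $fetch): string
{
    if (is_file($cacheFile) && time() - filemtime($cacheFile) < $ttl) {
        return file_get_contents($cacheFile);
    }
    $html = $fetch();
    file_put_contents($cacheFile, $html);
    return $html;
}

// Possible usage (left as a comment so the sketch does not hit the network):
//
// $url  = 'http://www.skysports.com/football/competitions/bundesliga/table';
// $html = cachedFetch(sys_get_temp_dir() . '/bundesliga.html', 3600, function () use ($url) {
//     return file_get_contents($url);
// });
// $doc = new \DOMDocument();
// @$doc->loadHTML($html);
```

With this in place, the DOM code from the answer would parse the cached string via loadHTML instead of calling loadHTMLFile on the remote URL each time.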

    This answer was selected by the asker as the best answer.