duanji7182 2013-05-05 10:50

PHP - Parsing an HTML table via the DOM

I am using the PHP Simple HTML DOM Parser and trying to extract the Top Goalscorers table (the top 5) from this webpage: http://www.transfermarkt.co.uk/en/chinese-super-league/startseite/wettbewerb_CSL.html

The table I want to parse, Top Goal Scorers, has the ID "spieler". I want to get each table row and output it in my own format. The problem is that below the Name / Club heading, each row contains a nested <table> that lays out the image, player name and club name.

I am trying to understand the DOM structure so that I know what to select to get the right player name, club name and goals. Thanks.

Here's what I have so far:

<textarea id='txt_out'>
<?php
echo "Player | Team | Goals
:--|:--|:--:
";

$url = "http://www.transfermarkt.co.uk/en/chinese-super-league/startseite/wettbewerb_CSL.html";
$html = file_get_html($url);

foreach($html->find('#spieler') as $row) {

  if ($i > 0) {
   $player = $row->find('table tr',3)->plaintext;
        echo $player . "|TEST TEAM|0";
    }
   $i++;
}
?>
</textarea>

The echo inside the loop prints nothing; only the header is output:

<textarea id="txt_out">Player | Team | Goals
:--|:--|:--:
</textarea>

2 answers

  • duanci9305 2013-05-05 11:34

    There you go (you may have to adjust the selectors and attributes a bit to get your desired output). This solution takes every td in each row and outputs its plaintext, after checking that the td does not itself contain the inner table.

    $output = '<table border="1">
                    <tr>
                        <td>#</td>
                        <td>Player</td>
                        <td>Team</td>
                        <td>goals-1</td>
                        <td>goals-2</td>
                        <td>goals-3</td>
                        <td>points</td>
                    </tr>
                ';
    
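    // Assumes simple_html_dom.php (PHP Simple HTML DOM Parser) is already included.
    $url = "http://www.transfermarkt.co.uk/en/chinese-super-league/startseite/wettbewerb_CSL.html";
    $html = file_get_html($url);
    
    // The Top Goal Scorers table has the id "spieler".
    $tbl = $html->find('#spieler',0);
    
    // Select the data rows, which carry the class "dunkel" or "hell".
    $trs = $tbl->find('tr[class=dunkel],tr[class=hell]');
    
    foreach($trs as $tr){
        $output .= '<tr>';
        $tds = $tr->find('td');
        foreach($tds as $td){
            // Skip any cell that wraps the nested table; the nested cells
            // are themselves matched by find('td') and handled on their own.
            $inner_table = $td->find('table',0);
            if(!$inner_table){
                $text = trim($td->plaintext);
                // Skip cells that contain no text.
                if($text != ''){
                    $output .= '<td>' . $text . '</td>';
                }
            }
        }
        $output .= '</tr>';
    }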
    
    $output .= '</table>';
    
    echo($output);
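
For comparison, here is a minimal self-contained sketch of the same idea using PHP's built-in DOM extension (DOMDocument and DOMXPath) instead of Simple HTML DOM. It produces the Player | Team | Goals layout from the question. The sample markup, the `topScorers` function name, and the assumption that the nested table holds the player name followed by the club name are all hypothetical; the real page may be structured differently, so treat the XPath expressions as a starting point.

```php
<?php
// Hypothetical, simplified stand-in for the real page; the actual markup may differ.
$sample = <<<'HTML'
<table id="spieler">
  <tr><th>#</th><th>Name / Club</th><th>Goals</th></tr>
  <tr class="hell">
    <td>1</td>
    <td><table>
      <tr><td rowspan="2"><img src="a.jpg" alt=""></td><td>Player A</td></tr>
      <tr><td>Club A</td></tr>
    </table></td>
    <td>5</td>
  </tr>
  <tr class="dunkel">
    <td>2</td>
    <td><table>
      <tr><td rowspan="2"><img src="b.jpg" alt=""></td><td>Player B</td></tr>
      <tr><td>Club B</td></tr>
    </table></td>
    <td>4</td>
  </tr>
</table>
HTML;

// Hypothetical helper: returns one ['player', 'team', 'goals'] array per data row.
function topScorers(string $html): array
{
    $doc = new DOMDocument();
    // Real-world pages are rarely valid HTML; collect parser warnings silently.
    libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($doc);
    $rows = [];

    // Match only the outer data rows by class, as in the answer above.
    $query = '//table[@id="spieler"]//tr[@class="hell" or @class="dunkel"]';
    foreach ($xpath->query($query) as $tr) {
        // Direct child cells of the outer row (rank, name/club wrapper, goals).
        $cells = $xpath->query('./td', $tr);
        // Cells of the nested table that contain text (assumed: name, then club).
        $inner = $xpath->query('.//table//td[normalize-space()]', $tr);
        if ($cells->length === 0 || $inner->length < 2) {
            continue; // Row does not match the assumed layout.
        }
        $rows[] = [
            'player' => trim($inner->item(0)->textContent),
            'team'   => trim($inner->item(1)->textContent),
            // Assumed: the goals are in the last direct cell of the row.
            'goals'  => trim($cells->item($cells->length - 1)->textContent),
        ];
    }
    return $rows;
}

echo "Player | Team | Goals\n:--|:--|:--:\n";
foreach (topScorers($sample) as $row) {
    echo $row['player'] . '|' . $row['team'] . '|' . $row['goals'] . "\n";
}
```

To run it against the live page, the HTML string would have to be fetched first (for example with `file_get_contents($url)`), and the cell positions checked against the real markup.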
    
    This answer was selected as the best answer by the asker.