douzai1074 2011-12-25 23:50

How do I make a small PHP link "spider" and extract data?

I want to spider a simple white website that has lots of HTML links, each representing a name, address and phone number. From each page I want to extract exactly the 3 fields that sit between the 3 TDs, such as:

    <div id="idTabResults2" align="center">
        <TABLE border='1'>
    <tr><th>Name</th><th>Adress</th><th>Phone number</th></tr>
    <TR>
          <TD>Joe</TD><TD>New York</TD><TD>555999</TD></TR>
    </TABLE>

    </div>

So in the example above I would get "Joe", "New York" and 555999. I'm using PHP, and later MySQL to insert every result into my DB. Can someone point me in the right direction on how to go about this?


2 answers

  • douyu5679 2011-12-25 23:56

    Maybe a faster (and simpler) way than PeeHaa's solution is to use the Simple HTML DOM parser (simple_html_dom.php).

    For instance:

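    <?php
    // Load the Simple HTML DOM parser library.
    require 'simple_html_dom.php';

    // YOUR_PAGE_HERE is a placeholder: put the URL of the page to scrape here.
    $data = file_get_contents('YOUR_PAGE_HERE');
    $html = str_get_html($data);

    // Collect every <td> element on the page.
    $tds = $html->find('td');

    foreach ($tds as $td) {
      // Do something with the cell text, e.g. $td->plaintext
    }
    ?>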
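
To make the "Do something" step concrete, here is a minimal sketch that groups the cells by row and also collects the links a spider would follow next. It is an assumption-laden illustration, not the answerer's method: it swaps Simple HTML DOM for PHP's bundled DOM extension (DOMDocument and DOMXPath) so it runs without extra files, and the helper names loadHtml(), extractRows() and extractLinks() are hypothetical.

```php
<?php
// Sketch only: uses PHP's bundled DOM extension instead of Simple HTML DOM.
// loadHtml(), extractRows() and extractLinks() are hypothetical helper names.

// Parse an HTML string and return an XPath object for querying it.
function loadHtml(string $html): DOMXPath
{
    // Collect parser warnings internally; real-world markup is rarely valid.
    libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();
    return new DOMXPath($doc);
}

// Return one ['name', 'address', 'phone'] entry per row with exactly 3 <td> cells.
function extractRows(string $html): array
{
    $xpath = loadHtml($html);
    $rows = [];
    // '//tr[td]' skips the header row, which contains only <th> cells.
    foreach ($xpath->query('//tr[td]') as $tr) {
        $cells = [];
        foreach ($xpath->query('td', $tr) as $td) {
            $cells[] = trim($td->textContent);
        }
        if (count($cells) === 3) {
            $rows[] = [
                'name'    => $cells[0],
                'address' => $cells[1],
                'phone'   => $cells[2],
            ];
        }
    }
    return $rows;
}

// Return the unique href values found on the page (candidates to crawl next).
function extractLinks(string $html): array
{
    $xpath = loadHtml($html);
    $links = [];
    foreach ($xpath->query('//a[@href]') as $a) {
        $links[] = $a->getAttribute('href');
    }
    return array_values(array_unique($links));
}

// The sample markup from the question.
$sample = <<<HTML
<div id="idTabResults2" align="center">
<TABLE border='1'>
<tr><th>Name</th><th>Adress</th><th>Phone number</th></tr>
<TR><TD>Joe</TD><TD>New York</TD><TD>555999</TD></TR>
</TABLE>
</div>
HTML;

print_r(extractRows($sample));
```

For the sample from the question this should yield a single row holding Joe, New York and 555999. Each row could then be written to MySQL with a PDO prepared statement, and each link from extractLinks() fetched with file_get_contents() and passed through extractRows() in turn.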
    
    This answer was selected by the asker as the best answer.