dporb84480 2014-06-25 21:53
Accepted

Parsing multiple result pages

I am trying to parse a website to obtain aeronautical information. Here is the link to the website:

NOTAMS

The problem is that the script shows only the first 20 entries (a single page's worth) of the entire result set (38), and I am unable to iterate through the other pages because the "next" URL doesn't contain any page information. Here is my code:

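$DOM = new DOMDocument();
// Collect libxml warnings instead of printing them for malformed HTML
libxml_use_internal_errors(true);
$DOM->loadHTMLFile($url);

$selector = new DOMXPath($DOM);
$elements = $selector->query('//td[@headers="notam"]');

// query() returns false for an invalid expression; it never returns null
if ($elements !== false) {
  foreach ($elements as $element) {
    echo "<br/>";
    // Print every non-empty child node of the NOTAM cell
    foreach ($element->childNodes as $node) {
      $display = $node->nodeValue;
      if ($display !== null && $display !== '') {
        echo $display . "<br>";
      }
    }
  }
}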

1 answer

  • doujianwei8217 2014-06-25 22:47

    I can't really see what the problem is with iterating over the result pages. If you add page=x to the query string, it will work. Could you give us some more details?

    Example page 1, Example page 2 (with ...?page=2...; the rest of the URL does not change)

    As seen in example 1, you can also use page=1 in your first URL, which makes your code even simpler because every page can be requested the same way.
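The page=x idea above could be sketched roughly as follows. This is only an illustration under assumptions, not the site's documented behaviour: the helper names withPage, extractNotams and fetchAllNotams are made up for this example, and the loop assumes that a page past the last one contains no td cells with headers="notam". A page cap guards against sites that keep returning the last page instead.

```php
<?php
// Hypothetical helper: set or override the "page" query parameter of a URL.
// Note: parse_str() rewrites dots and spaces in parameter names to underscores.
function withPage(string $url, int $page): string
{
    $query = [];
    parse_str((string) parse_url($url, PHP_URL_QUERY), $query);
    $query['page'] = $page;
    // Keep everything before the query string and drop any fragment.
    $base = explode('?', explode('#', $url)[0])[0];
    return $base . '?' . http_build_query($query);
}

// Hypothetical helper: return the trimmed text of every NOTAM cell in one page.
function extractNotams(string $html): array
{
    if ($html === '') {
        return [];
    }
    libxml_use_internal_errors(true); // real-world HTML is rarely valid
    $dom = new DOMDocument();
    $dom->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($dom);
    $texts = [];
    foreach ($xpath->query('//td[@headers="notam"]') as $cell) {
        $text = trim($cell->textContent);
        if ($text !== '') {
            $texts[] = $text;
        }
    }
    return $texts;
}

// Request page=1, page=2, ... until a page yields no NOTAM cells
// (an assumption about the site) or the page cap is reached.
function fetchAllNotams(string $url, callable $fetch, int $maxPages = 50): array
{
    $all = [];
    for ($page = 1; $page <= $maxPages; $page++) {
        $html = $fetch(withPage($url, $page));
        $rows = is_string($html) ? extractNotams($html) : [];
        if ($rows === []) {
            break;
        }
        $all = array_merge($all, $rows);
    }
    return $all;
}
```

With the $url from the question, a call such as fetchAllNotams($url, 'file_get_contents') would collect the entries from all pages into one array. Passing the fetcher as a callable keeps the paging logic independent of how the HTML is downloaded.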

    Depending on your service requirements, I would also consider using the Kimonolabs website-to-API service. You would then end up querying a simple JSON API, and a library such as Guzzle would keep your code maintainable and easy to read.

    This answer was selected by the asker as the best answer.