douwei7501 2012-06-30 21:59

PHP - Is there a safe way to perform deep recursion?

I'm talking about performing a deep recursion that runs for 5+ minutes, the kind of thing a crawler might do in order to extract the URL links and sub-URL links of pages.

It seems that deep recursion in PHP is not realistic.

e.g.

// Assumes the Simple HTML DOM library, which provides file_get_html() and find().
getInfo("http://www.example.com");

function getInfo($link){
   $content = file_get_html($link);

   if($content && ($con = $content->find('.subCategories',0))){
      echo "go deeper<br>";
      getInfo($con->find('a',0)->href);
   }
   else{
      echo "reached deepest<br>";
   }
}

1 answer

  • duangang1991 2012-06-30 22:02

    Doing something like this with recursion is a bad idea in any language. You cannot know how deep the crawler will go, so it might cause a stack overflow. Even if it does not, it wastes a lot of memory on the huge call stack, since PHP does not optimize tail calls (that is, it keeps every caller's stack frame even when it is no longer needed).

    Push the found URLs into a "to crawl" queue which is checked iteratively:

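    // Assumes the Simple HTML DOM library, which provides file_get_html() and find().
    $queue = array('http://www.example.com');
    $done = array();
    while($queue) {
        $link = array_shift($queue);
        $done[] = $link;
        $content = file_get_html($link);
        // Skip pages that failed to load or that have no sub-category link.
        if($content && ($con = $content->find('.subCategories', 0)) && ($a = $con->find('a', 0))) {
            $sublink = $a->href;
            // Queue the link only if it is neither visited nor already waiting.
            if(!in_array($sublink, $done) && !in_array($sublink, $queue)) {
                $queue[] = $sublink;
            }
        }
    }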
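
The loop above checks for duplicates with in_array(), which scans the whole list on every lookup, and it has no upper bound on how many pages it visits. The following is a minimal sketch of the same queue idea with those two points addressed; it is an illustration, not the answerer's code. The names crawl(), $fetchLinks, $maxPages and the in-memory $graph are assumptions introduced here, and $fetchLinks stands in for whatever code downloads a page and extracts its links.

```php
<?php
/**
 * Breadth-first crawl driven by an explicit queue instead of recursion.
 *
 * $fetchLinks is a hypothetical callback: given a URL, it returns an array
 * of the URLs found on that page (for example, by using an HTML parser).
 */
function crawl(string $start, callable $fetchLinks, int $maxPages = 1000): array
{
    $queue = new SplQueue();
    $queue->enqueue($start);
    $seen = [$start => true]; // URLs already queued or visited, keyed for fast lookup
    $visited = [];            // URLs in the order they were processed

    // Stop when nothing is left to crawl or the page budget is used up.
    while (!$queue->isEmpty() && count($visited) < $maxPages) {
        $link = $queue->dequeue();
        $visited[] = $link;

        foreach ($fetchLinks($link) as $sublink) {
            if (!isset($seen[$sublink])) {
                $seen[$sublink] = true;
                $queue->enqueue($sublink);
            }
        }
    }

    return $visited;
}

// Usage with an in-memory link graph standing in for real pages.
$graph = [
    'a' => ['b', 'c'],
    'b' => ['a', 'd'], // links back to 'a', forming a cycle
    'c' => ['d'],
    'd' => [],
];
$visited = crawl('a', fn(string $url): array => $graph[$url] ?? []);
print_r($visited);
```

Because every URL is marked in $seen at the moment it is queued, the cycle between 'a' and 'b' does not cause repeated visits, and memory use grows with the number of distinct URLs rather than with call depth.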
    
    This answer was selected by the asker as the best answer.