Recursion problem

I'm grabbing links from a website, but the higher I set the recursion depth for the function, the stranger the results become.

For example, when I call the function like this:

crawl_page("http://www.mangastream.com/", 10);

I get results like this for about half the page:

http://mangastream.com/read/naruto/51619850/1/read/naruto/51619850/2/read/naruto/51619850/2/read/naruto/51619850/2/read/naruto/51619850/2/read/naruto/51619850/2/read/naruto/51619850/2/read/naruto/51619850/2

EDIT: I'm expecting results like this instead:

http://mangastream.com/manga/read/naruto/51619850/1

Here's the function I've been using to get the results:

function crawl_page($url, $depth)
{
    static $seen = array();
    if (isset($seen[$url]) || $depth === 0) {
        return;
    }
    $seen[$url] = true;

    $dom = new DOMDocument('1.0');
    @$dom->loadHTMLFile($url);

    $anchors = $dom->getElementsByTagName('a');
    foreach ($anchors as $element) {
        $href = $element->getAttribute('href');
        if (0 !== strpos($href, 'http')) {
            $href = rtrim($url, '/') . '/' . ltrim($href, '/');
        }
        if (shouldScrape($href) == true) {
            crawl_page($href, $depth - 1);
        }
    }
    echo $url,"";
//,pageStatus($url)
}

Any help with this would be greatly appreciated.

dseax40600 2011-04-15 15:59
The construction of your new URL is not correct. Replace:

$href = rtrim($url, '/') . '/' . ltrim($href, '/');

with:

if (substr($href, 0, 1) == '/') {
    // href relative to root
    $info = parse_url($url);
    // parse_url() returns the scheme without the colon, so add '://' here
    $href = $info['scheme'] . '://' . $info['host'] . $href;
} else {
    // href relative to current path
    $href = rtrim(dirname($url), '/') . '/' . $href;
}
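
To illustrate why the original concatenation goes wrong: with $url = "http://mangastream.com/read/naruto/51619850/1" and a root-relative href of "/read/naruto/51619850/2", appending the href to the full page URL yields "http://mangastream.com/read/naruto/51619850/1/read/naruto/51619850/2", and every further level of recursion appends another copy of the path. The following is a minimal sketch of the same idea wrapped in a helper. The name resolve_href is hypothetical and not part of the original code, and the sketch assumes plain http/https links; it does not handle ports, "../" segments, or protocol-relative ("//host/...") hrefs.

```php
<?php
// Hypothetical helper (not from the original post): resolve $href against
// the URL of the page it was found on.
function resolve_href(string $base, string $href): string
{
    // Already absolute (http:// or https://): use it unchanged.
    if (preg_match('#^https?://#i', $href)) {
        return $href;
    }

    $info = parse_url($base);
    // Assumption: $base always has a scheme and a host.
    $root = $info['scheme'] . '://' . $info['host'];

    // Root-relative href: attach it directly to scheme and host.
    if (substr($href, 0, 1) === '/') {
        return $root . $href;
    }

    // Path-relative href: keep the base path up to and including its
    // last slash, then append the href.
    $path = $info['path'] ?? '/';
    $dir  = substr($path, 0, strrpos($path, '/') + 1);

    return $root . $dir . $href;
}
```

Inside crawl_page, the body of the `if (0 !== strpos($href, 'http'))` branch could then become `$href = resolve_href($url, $href);`.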
This answer was selected by the asker as the best answer.
