dougou8639 2016-08-25 12:52
Views: 51

PHP / Ajax: can incrementing a counter inside an anonymous function cause an error? + How to display standard PHP errors when using Ajax

I've developed a web scraper on one server, where it works and does what I want it to do. Now I have to implement it in another environment, and I've stumbled on an issue that I did not have during development and that I am having a hard time identifying.

The only real error I have to go on is (from JS console):

POST http://my.cool.page/pro/company/scrape 502 (Bad Gateway)

The development server (where it works) is using PHP 5.4.16, implementation server is on PHP 5.4.45. I am using the same versions of external code on both servers.

The circumstances for launching the scraper are a bit different in the implementation environment: it is now loaded through Ajax rather than as its own page.

The ajax call:

$("#showScraperButton").click(function () {
    $.post(
        '/pro/company/scrape',
        { 'url': url },
        function (result) {
            // code...
        }
    );
});

Function + case for scraping anchor tags, using Fabpot/Goutte:

function _getTagContent($crawler = '', $toScrape = '', $contentPatterns = '')
{
    $tagContent = array();
    ChromePhp::log("Hello _getTagContent");
    foreach ($toScrape as $tag) {
        $i = 0;
        switch ($tag) {
            case 'a':
                $n = $i; // manual counter, shared with the closure by reference
                $crawler->filter($tag)->each(
                    function ($node) use (&$tagContent, &$n, &$tag, &$crawler) {
                        $nodeText = trim($node->text());
                        $tagContent[$tag][$n]['value'] = $nodeText;
                        $linksCrawler = $crawler->selectLink($nodeText);
                        try {
                            // Prefer the URI resolved by the Link object
                            $link = $linksCrawler->link();
                            $magicDidHappen = true;
                        } catch (Exception $e) {
                            $magicDidHappen = false;
                        }

                        if ($magicDidHappen) {
                            $uri = $link->getUri();
                        } else {
                            // Fall back to the raw href attribute
                            $uri = $node->attr('href');
                        }

                        $tagContent[$tag][$n]['uri'] = $uri;
                        $n++;
                    }
                );
                break;

            default:
                break;
        }
    }
    return $tagContent;
}

This results in the error described above.

By commenting out the lines in this case one at a time, I found that the error only appears when

$n++;

is present. Without it there is no error, and $tagContent ends up holding only the final a element (every iteration writes to the same index).

This led me to believe that incrementing the counter is the problem in this case, and that the rest of the code does not throw errors. I then tried a different HTML tag, using similar syntax:

case 'h3':
    $n = $i;
    $crawler->filter($tag)->each(
        function ($node) use (&$tagContent, &$n, &$tag) {
            $tagContent[$tag][$n] = trim($node->text());
            $n++;
        }
    );
    break;

However, this works as intended, giving me all 40 instances of h3 on the page I'm scraping.

From this I have two questions. Could the problem be related to the difference in PHP versions? And is there a way to print the "standard" PHP errors when making Ajax calls (instead of, or in addition to, the HTTP response codes)? I'm sure they would hint at what is failing. Thanks very much for any help!
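On the second question, one possible approach is to have the endpoint report its own errors in the response body, so they show up in the browser's network tab next to the status code. The following is only a minimal sketch under stated assumptions: the helper names `ajaxErrorHandler` and `errorToJson` are hypothetical and not part of the code above, and it only helps if the PHP script itself is still running. A 502 is often produced by the gateway in front of PHP, in which case the web server or PHP error log is the more likely place to find the cause.

```php
<?php
// Hypothetical helpers for illustration; not part of the original scraper.

// Convert warnings and notices into exceptions so they can be caught and reported.
function ajaxErrorHandler($severity, $message, $file, $line)
{
    if (!(error_reporting() & $severity)) {
        return false; // respect the current error_reporting level and @-suppression
    }
    throw new ErrorException($message, 0, $severity, $file, $line);
}

// Describe an exception as a JSON string that can be sent back to the Ajax caller.
function errorToJson(Exception $e)
{
    return json_encode(array(
        'error'   => get_class($e),
        'message' => $e->getMessage(),
        'file'    => $e->getFile(),
        'line'    => $e->getLine(),
    ));
}

error_reporting(E_ALL);
set_error_handler('ajaxErrorHandler');

// Fatal errors bypass set_error_handler(); a shutdown function can still
// inspect them through error_get_last() and report them.
register_shutdown_function(function () {
    $last = error_get_last();
    $fatalTypes = array(E_ERROR, E_PARSE, E_CORE_ERROR, E_COMPILE_ERROR);
    if ($last !== null && in_array($last['type'], $fatalTypes, true)) {
        if (!headers_sent()) {
            header('Content-Type: application/json', true, 500);
        }
        echo json_encode(array(
            'error'   => 'fatal',
            'message' => $last['message'],
            'file'    => $last['file'],
            'line'    => $last['line'],
        ));
    }
});
```

With something like this in place, the body of the scrape endpoint could be wrapped in try/catch, echoing `errorToJson($e)` with a 500 status in the catch branch. This is meant for debugging only; file paths and messages should not be exposed this way in production.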

1 answer

  • dqk42179 2016-08-29 07:20
    It now works using

    case 'a':
        $crawler->filter($tag)->each(
            // each() passes the node and its zero-based position to the closure
            function ($node, $n) use (&$tagContent, &$tag, &$crawler) {
                $nodeText = trim($node->text());
                $tagContent[$tag][$n]['value'] = $nodeText;
                $linksCrawler = $crawler->selectLink($nodeText);

                try {
                    $link = $linksCrawler->link();
                    $magicDidHappen = true;
                } catch (Exception $e) {
                    $magicDidHappen = false;
                }

                if ($magicDidHappen) {
                    $uri = $link->getUri();
                } else {
                    $uri = $node->attr('href');
                }

                $tagContent[$tag][$n]['uri'] = $uri;
                $n++; // no longer needed: $n is a by-value parameter supplied by each()
            }
        );
        break;
    

    I moved $n out of the use() clause and into the function parameters, so that each() supplies the index as the closure's second argument. I believe ChromePhp might have been causing some issues here. I still don't really know what went wrong, but now it works.
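To illustrate the difference between the two styles without depending on Goutte, here is a small self-contained sketch. It assumes a hypothetical stand-in function, `eachWithIndex`, that mimics the way the crawler's each() calls the closure with the item and its position; the sample link texts are made up. It does not explain the 502 itself, it only shows that both styles build the same array.

```php
<?php
// Hypothetical stand-in for the crawler's each(): calls $callback($item, $index)
// once per item and collects the return values.
function eachWithIndex(array $items, $callback)
{
    $results = array();
    foreach ($items as $i => $item) {
        $results[] = $callback($item, $i);
    }
    return $results;
}

// Made-up link texts standing in for the scraped anchor nodes.
$links = array(' Home ', 'About', ' Contact');

// Style 1: a counter captured by reference and incremented by hand.
$byCounter = array();
$n = 0;
eachWithIndex($links, function ($text) use (&$byCounter, &$n) {
    $byCounter['a'][$n]['value'] = trim($text);
    $n++;
});

// Style 2: the index arrives as the closure's second argument,
// so no shared counter is needed.
$byIndex = array();
eachWithIndex($links, function ($text, $i) use (&$byIndex) {
    $byIndex['a'][$i]['value'] = trim($text);
});

// Compare the two results.
var_dump($byCounter === $byIndex);
```

The second style keeps less state outside the closure, which makes it the easier one to reason about when something goes wrong.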
