dot_0620 2016-06-13 10:04

Simple Html Dom - Get Google indexed pages

I am trying to get the pages Google has indexed for a site by sending a site: query.

For that I have a function called get_serps:

public function get_serps($pages, $start, $query)
{
    // Added temporarily for debugging
    $query = 'site:test.de';
    //Get Simple Html Dom
    $parser = $this->container->get('simple_html_dom');

    $googleurl = 'http://www.google.de/search?num=100&start='.$start.'&hl=de&safe=off&q='.$query;
    echo "<pre>" . $googleurl . "</pre>";
    $html = $parser->file_get_html($googleurl);

    foreach ($html->find('#ires g r a') as $link) {
        echo '</br> 2 </br>';
        $linkurl = $link->href;
        echo $linkurl.'</br>';
        preg_match_all('#http(s)?://\b[^&]*(.*?)#', $linkurl, $target);
        ++$count;
    }

    $next = $parser->modified_find('#nav tbody tr', 0);
    $next = is_object($next) ? $next->last_child() : '';
    echo $next;
    if (!empty($next) && $next->find('a')) {
        $parser->clear();
        unset($parser);
        $this->get_serps($pages, $start + 100, $query);
    } else {
        echo 'Count: '. $count;
        return $count;
    }
}

The problem is that find('#ires g r a') doesn't return any results.

Just an empty array...

The find function comes from the Simple Html Dom package.

This is the error I am getting:

Call to a member function modified_find() on null

The reason is that the find function returned an empty array, but I have no idea why it cannot find anything.

function find($selector, $idx=null, $lowercase=false)
{
    echo 'Selector: ' . $selector . '</br>';
    $selectors = $this->parse_selector($selector);

    if (($count=count($selectors))===0) return array();
    $found_keys = array();

    // find each selector
    for ($c=0; $c<$count; ++$c)
    {
        // The change on the below line was documented on the sourceforge code tracker id 2788009
        // used to be: if (($levle=count($selectors[0]))===0) return array();
        if (($levle=count($selectors[$c]))===0) return array();
        if (!isset($this->_[HDOM_INFO_BEGIN])) return array();

        $head = array($this->_[HDOM_INFO_BEGIN]=>1);

        // handle descendant selectors, no recursive!
        for ($l=0; $l<$levle; ++$l)
        {
            $ret = array();
            foreach ($head as $k=>$v)
            {
                $n = ($k===-1) ? $this->dom->root : $this->dom->nodes[$k];
                //PaperG - Pass this optional parameter on to the seek function.
                $n->seek($selectors[$c][$l], $ret, $lowercase);
            }
            $head = $ret;
        }

        foreach ($head as $k=>$v)
        {
            if (!isset($found_keys[$k]))
            {
                $found_keys[$k] = 1;
            }
        }
    }

    // sort keys
    ksort($found_keys);

    $found = array();
    foreach ($found_keys as $k=>$v)
        $found[] = $this->dom->nodes[$k];
    var_dump($found);

    // return nth-element or array
    if (is_null($idx)) return $found;
    else if ($idx<0) $idx = count($found) + $idx;

    return (isset($found[$idx])) ? $found[$idx] : null;

}

The whole thing is built on the Symfony framework.

1 answer

  • dongyan5706 2016-06-13 10:49
    Call to a member function modified_find() on null  
    

    The error clearly states that find() is not the problem. The problem is that you call modified_find() on a null object:

    $next = $parser->modified_find('#nav tbody tr', 0);
    

    In that line $parser is null, as the error says. The key is this line:

    $html = $parser->file_get_html($googleurl);
    

    Your result ends up in $html, not in $parser, so you need to call find on $html instead:

    $next = $html->modified_find('#nav tbody tr', 0);
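
The same idea can be tried without the library. What follows is a minimal sketch, not the asker's code: it swaps in PHP's built-in DOMDocument and DOMXPath in place of Simple Html Dom so that it runs on its own, and the function name extract_links and the sample markup are made up for illustration. It shows the two points above: the queries run against the parsed document, and a lookup that may return null is checked before a method is called on it.

```php
<?php
// Sketch only: PHP's built-in DOM extension stands in for Simple Html Dom.
// extract_links() and the sample markup are hypothetical.

/**
 * Collect the result links and the "next page" link from an HTML string.
 */
function extract_links(string $markup): array
{
    $doc = new DOMDocument();
    // Collect libxml warnings about imperfect markup instead of printing them.
    libxml_use_internal_errors(true);
    $loaded = $doc->loadHTML($markup);
    libxml_clear_errors();

    if ($loaded === false) {
        return ['links' => [], 'next' => null];
    }

    // Query the parsed document, not the object that produced it.
    $xpath = new DOMXPath($doc);

    $links = [];
    foreach ($xpath->query('//*[@id="ires"]//a[@href]') as $a) {
        $links[] = $a->getAttribute('href');
    }

    // item(0) returns null when nothing matched, so check before using it.
    $nextNode = $xpath->query('//*[@id="nav"]//tr/td[last()]//a[@href]')->item(0);
    $next = $nextNode !== null ? $nextNode->getAttribute('href') : null;

    return ['links' => $links, 'next' => $next];
}

// Made-up stand-in for a result page.
$sample = '<html><body>'
    . '<div id="ires"><a href="http://test.de/a">A</a><a href="http://test.de/b">B</a></div>'
    . '<table id="nav"><tr><td>1</td><td><a href="/search?start=100">Next</a></td></tr></table>'
    . '</body></html>';

$result = extract_links($sample);
```

With Simple Html Dom the guard looks the same: the find function quoted in the question returns null when an index is given and nothing matches, so check the result with is_object() or a null comparison before calling last_child() or find() on it.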
    