dongshicuo4844 2014-06-13 19:24

Scraping a website with Laravel & Elvedia \ Goutte: how to extract JSON

I managed to successfully access a remote JSON resource using Goutte with Laravel 4:

$client = Goutte::getNewClient();

//*
$crawler = $client->request('GET', 'http://domain.mg/admin');

$form = $crawler->selectButton('Login')->form();
$crawler = $client->submit($form, array('username' => 'username', 'password' => 'password'));

//*/

$crawler = $client->request('GET', 'http://domain.mg/usergroup/list'); // Yields JSON Response

return dd($crawler);

It yields an output like so:

object(Symfony\Component\DomCrawler\Crawler)#285 (4) { ["uri":protected]=> string(36) "http://domain.mg/usergroup/list" ["defaultNamespacePrefix":"Symfony\Component\DomCrawler\Crawler":private]=> string(7) "default" ["namespaces":"Symfony\Component\DomCrawler\Crawler":private]=> array(0) { } ["storage":"SplObjectStorage":private]=> array(1) { ["0000000075faaa10000000001af55ef8"]=> array(2) { ["obj"]=> object(DOMElement)#241 (17) { ["tagName"]=> string(4) "html" ["schemaTypeInfo"]=> NULL ["nodeName"]=> string(4) "html" ["nodeValue"]=> string(438) "[{"id":1,"group_name":"Compte principal","group_desc":"Administrateur","group_level":9},{"id":2,"group_name":"Profil pour les comptables","group_desc":"Comptables","group_level":2},{"id":3,"group_name":"Validateur d'op\u00e9ration","group_desc":"Superviseur","group_level":9},{"id":18,"group_name":"No Comment","group_desc":"Autres employ\u00e9s","group_level":6},{"id":41,"group_name":"Invit\u00e9","group_desc":"Guest","group_level":2}]" ["nodeType"]=> int(1) ["parentNode"]=> string(22) "(object value omitted)" ["childNodes"]=> string(22) "(object value omitted)" ["firstChild"]=> string(22) "(object value omitted)" ["lastChild"]=> string(22) "(object value omitted)" ["previousSibling"]=> string(22) "(object value omitted)" ["attributes"]=> string(22) "(object value omitted)" ["ownerDocument"]=> string(22) "(object value omitted)" ["namespaceURI"]=> NULL ["prefix"]=> string(0) "" ["localName"]=> string(4) "html" ["baseURI"]=> NULL ["textContent"]=> string(438) "[{"id":1,"group_name":"Compte principal","group_desc":"Administrateur","group_level":9},{"id":2,"group_name":"Profil pour les comptables","group_desc":"Comptables","group_level":2},{"id":3,"group_name":"Validateur d'op\u00e9ration","group_desc":"Superviseur","group_level":9},{"id":18,"group_name":"No Comment","group_desc":"Autres employ\u00e9s","group_level":6},{"id":41,"group_name":"Invit\u00e9","group_desc":"Guest","group_level":2}]" } ["inf"]=> NULL } } }

I am stuck on extracting or converting the JSON held inside the $crawler object. How can that be done?


  • duanbigan7765 2014-06-14 11:48

    Delving into the documentation for the Symfony\Component\DomCrawler\Crawler class, I found

    public string html()
    
        Returns the first node of the list as HTML.
    
        Return Value
    
        string  The node html
    

    which works as I expected.

    Turning return dd($crawler) into return ($crawler->html()) yields:

    [{"id":1,"group_name":"Compte principal","group_desc":"Administrateur","group_level":9},{"id":2,"group_name":"Profil pour les comptables","group_desc":"Comptables","group_level":2},{"id":3,"group_name":"Validateur d'op\u00e9ration","group_desc":"Superviseur","group_level":9},{"id":18,"group_name":"No Comment","group_desc":"Autres employ\u00e9s","group_level":6},{"id":41,"group_name":"Invit\u00e9","group_desc":"Guest","group_level":2}]
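The string above is still just text; to work with it as data it has to be decoded. A minimal sketch, assuming the body has already been obtained from the crawler: `$body` below is a hard-coded, shortened stand-in for that string, and the variable names are illustrative rather than part of the original code.

```php
<?php
// Stand-in for the string returned by the crawler (shortened from the output above).
$body = '[{"id":1,"group_name":"Compte principal","group_desc":"Administrateur","group_level":9},'
      . '{"id":3,"group_name":"Validateur d\'op\u00e9ration","group_desc":"Superviseur","group_level":9}]';

// Decode into nested associative arrays (second argument true).
$groups = json_decode($body, true);

// json_decode() returns null on failure, so check the error state explicitly.
if (json_last_error() !== JSON_ERROR_NONE) {
    throw new RuntimeException('Response is not valid JSON (error code ' . json_last_error() . ')');
}

// Each element is now an ordinary PHP array.
foreach ($groups as $group) {
    echo $group['id'], ': ', $group['group_name'], ' (level ', $group['group_level'], ")\n";
}
```

In a Laravel controller the decoded array could then be returned or passed to a view instead of being echoed.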

    Conclusion

    Goutte handled the complex login process (Laravel with its CSRF mechanism) very well, but I dislike stripping the JSON string out with html().

    Using return ($crawler->text()) gives the same outcome and is, in my opinion, more "neutral".
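One way to see why the text view is the safer of the two is to push a JSON payload through PHP's built-in DOM parser and compare the text with the re-serialised markup. This is only a sketch using `DOMDocument` directly, not Goutte or the Crawler class; the payload and variable names are hypothetical, and the exact markup produced may vary with the libxml version.

```php
<?php
// Hypothetical payload: one value contains "&", which is legal in JSON
// but has a special meaning in HTML.
$payload = '[{"id":1,"group_name":"R&D"}]';

// Parse the payload as if it were an HTML page, roughly what a DOM-based
// crawler has to do with any response body it receives.
libxml_use_internal_errors(true); // keep parser warnings about the bare "&" quiet
$doc = new DOMDocument();
$doc->loadHTML($payload);
libxml_clear_errors();

$root = $doc->documentElement; // the synthetic <html> element

// Text view: the concatenated text nodes.
$asText = $root->textContent;

// Markup view: serialise the children of <html> back to HTML.
$asHtml = '';
foreach ($root->childNodes as $child) {
    $asHtml .= $doc->saveHTML($child);
}

// Compare how each view fares as JSON input.
var_dump(json_decode($asText, true) !== null);
var_dump(json_decode($asHtml, true) !== null);
```

On a typical PHP build the text view should still decode, while the markup view carries wrapper tags and an escaped `&amp;` and should not. Text extraction is not lossless either, since anything in a JSON value that looks like a tag would be dropped. If the installed Goutte version exposes the BrowserKit response through `$client->getResponse()`, reading its `getContent()` would avoid the DOM round-trip entirely; that is worth checking against the version in use.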

    This answer was selected by the asker as the best answer.