dongleman4760 2013-11-20 01:40
Accepted

Visiting a website with PhantomJS


I'm trying PhantomJS for the first time and would like to download a remote site from PHP for SEO purposes.

I've succeeded in downloading the HTML content, but the pages are always the "JavaScript not enabled" fallbacks. From this I can only conclude that PhantomJS is visiting the sites without JavaScript support. I've posted the script I'm currently using below, which should be pretty standard. Does anyone know a better way of returning remote HTML content with PhantomJS?

phantom.js

var page = require('webpage').create();
var system = require('system');
var url = system.args[1];

page.open(url,
    function(status){
        if (status !== 'success') {
            phantom.exit(1);
            return;
        } else {
            page.evaluate(
                function() { 
                    return document.documentElement.outerHTML;
                }, 
                function(result){
                    console.log(result);
                }); 
        }
        phantom.exit();
    });

index.php

$url = escapeshellarg('<some url to test>');
$script = "phantom.js";
$contents = shell_exec("/usr/local/bin/phantomjs $script $url");

1 answer

  • dourong9253 2013-11-20 03:35

    How about simply using page.content? Does this work:

    var page = require('webpage').create();
    var system = require('system');
    var url = system.args[1];
    
    page.open(url, function (status) {
        if (status !== 'success') {
            console.log("FAILED:" + status);
        } else {
            console.log(page.content);
        }
        phantom.exit();
    });
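
If the target site builds its markup with client-side scripts, reading page.content straight away in the open callback may still be too early. Below is a minimal sketch of one possible variation. The `summarize` helper and the 2000 ms `delayMs` grace period are illustrative assumptions, not part of the script above or of the PhantomJS API. It also exits with a non-zero code on failure so the calling PHP script can tell the two cases apart.

```javascript
// Decide what to print and which exit code to use once the load has finished.
// `page` only needs a `content` property, so this part does not depend on PhantomJS.
// (Hypothetical helper, introduced here for illustration.)
function summarize(status, page) {
    if (status !== 'success') {
        return { code: 1, output: 'FAILED:' + status };
    }
    return { code: 0, output: page.content };
}

// PhantomJS wiring; skipped when the file is loaded outside PhantomJS.
if (typeof phantom !== 'undefined') {
    var page = require('webpage').create();
    var system = require('system');
    var url = system.args[1];
    var delayMs = 2000; // assumption: arbitrary grace period for page scripts

    page.open(url, function (status) {
        // On success, wait a little so client-side scripts can update the DOM
        // before page.content is read; on failure, report immediately.
        setTimeout(function () {
            var result = summarize(status, page);
            console.log(result.output);
            phantom.exit(result.code);
        }, status === 'success' ? delayMs : 0);
    });
}
```

A fixed delay is the simplest choice but not a reliable one: a slow page may need longer, and a fast one wastes time. On the PHP side, `exec()` (unlike `shell_exec()`) can hand back the exit code through its third argument, which is one way to detect the failure case.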
    
    This answer was selected by the asker as the best answer.
