Anonymous user 2015-09-11 14:54

PHP file_get_contents and cURL cannot fetch the content of a JavaScript-based page

I have tried both file_get_contents and cURL to fetch the content of a specific page. I set up cURL to follow redirects and changed the User-Agent, but it did not work. The page loads without problems in a browser. Whenever I try to fetch it with file_get_contents or cURL, I get a page with the following code:

<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>Loading ...</title>
<script src="/jquery.js" type="text/javascript"></script>
</head>
<html>
<noscript>Enable java!</noscript>
<div id="status"></div>
<script type="text/javascript">
function check() {
    $.ajax({
        type: "POST",
        url: "/index.php",
        data: "allowed=5b91b80a061537ae6a23835aba38279e",
        success: function (html) {
            if (html == "allowed") {
                location.reload();
            }
        },
        beforeSend: function () {
            $("#status").html("Loading ...")
        }
    });
}
$(document).ready(function () {
    check();
});
</script>
</html>

Is there any way to bypass this restriction without using JavaScript-based web scrapers such as PhantomJS, CasperJS, or Zombie.js? In other words, can it be done in plain PHP?
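
For illustration only, the handshake that the page's check() function performs could be replayed in plain PHP roughly as follows. This is a minimal sketch under several assumptions that the page itself does not confirm: that the token is a hex string embedded as allowed=... in the returned HTML, that the server remembers the successful POST through a cookie, and that nothing else (IP checks, extra headers) is involved. The function names extractAllowedToken and fetchBehindChallenge and the example.com URL are hypothetical.

```php
<?php
// Hypothetical helper: pull the token out of the challenge page.
// Assumes the token is a hex string following "allowed=", as in the sample page.
function extractAllowedToken(string $html): ?string
{
    if (preg_match('/allowed=([0-9a-f]+)/i', $html, $m)) {
        return $m[1];
    }
    return null; // no challenge found in this HTML
}

// Hypothetical helper: fetch a page, replaying the POST from check() if needed.
function fetchBehindChallenge(string $baseUrl, string $path): string
{
    $ch = curl_init();
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_FOLLOWLOCATION => true,
        CURLOPT_COOKIEFILE     => '', // empty string enables in-memory cookies
        CURLOPT_USERAGENT      => 'Mozilla/5.0',
    ]);

    // 1. First request: expected to return the "Loading ..." challenge page.
    curl_setopt($ch, CURLOPT_URL, $baseUrl . $path);
    $html = (string) curl_exec($ch);

    $token = extractAllowedToken($html);
    if ($token !== null) {
        // 2. Same POST that $.ajax sends: allowed=<token> to /index.php.
        curl_setopt_array($ch, [
            CURLOPT_URL        => $baseUrl . '/index.php',
            CURLOPT_POST       => true,
            CURLOPT_POSTFIELDS => 'allowed=' . $token,
            // jQuery adds this header to same-origin AJAX requests;
            // whether the server checks it is an assumption.
            CURLOPT_HTTPHEADER => ['X-Requested-With: XMLHttpRequest'],
        ]);
        $reply = (string) curl_exec($ch);

        // 3. Equivalent of location.reload(): request the page again,
        //    reusing the same handle so any cookies set so far are sent.
        if (trim($reply) === 'allowed') {
            curl_setopt_array($ch, [
                CURLOPT_URL        => $baseUrl . $path,
                CURLOPT_HTTPGET    => true,
                CURLOPT_HTTPHEADER => [],
            ]);
            $html = (string) curl_exec($ch);
        }
    }

    return $html;
}

// Usage (hypothetical host):
// echo fetchBehindChallenge('http://example.com', '/some/page');
```

Reusing a single cURL handle keeps the cookies from the first response available to the POST and to the final GET, which is what a browser would do across the AJAX call and the reload. If the second fetch still returns the challenge page, the server is likely checking something this sketch does not reproduce.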
