douxian1939 2012-10-30 14:37

Fetching a page from Facebook using curl / file_get_contents

I am trying to get the content of my Facebook page like so:

echo file_get_contents("http://www.facebook.com/dma.y");

The problem is that it doesn't give me the page; it redirects me to another page that says I need to upgrade my browser. So I thought I would use curl instead and fetch the page by sending a request with some headers.

echo get_follow_url('http://www.facebook.com/dma.y');

function get_follow_url($url) {
    // must set $url first. Duh...
    $http = curl_init($url);
    curl_setopt($http, CURLOPT_RETURNTRANSFER, TRUE);
    curl_setopt($http, CURLOPT_HTTPHEADER, get_headers('http://google.com'));
    // do your curl thing here
    $result = curl_exec($http);

    if (curl_errno($http)) {
        echo "<br/>An error has been thrown!<br/>";
        exit();
    }
    $http_status = curl_getinfo($http, CURLINFO_HTTP_CODE);
    curl_close($http);
    return $http_status;
}

Still no luck. I expect a status code of either 404 or 200, depending on whether I am logged into Facebook. Instead it returns 301, because it identifies my request as not coming from a regular browser. So what am I missing in the curl option settings?

UPDATE: What I am actually trying to do is replicate this functionality:

The script will trigger the onload or onerror function, depending on the status code returned.

That code will retrieve the page. However, that JavaScript method is clumsy and breaks in some browsers, such as Firefox, because the response isn't a JavaScript file.


1 Answer

  • doujionggan9570 2012-10-30 14:41

    You might want to try setting the user agent with cURL.

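    $url = 'https://www.facebook.com/cocacola';
    $http = curl_init($url);
    $fake_user_agent = 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7) Gecko/20040803 Firefox/0.9.3';
    // Send a browser-like User-Agent header with the request
    curl_setopt($http, CURLOPT_USERAGENT, $fake_user_agent);
    // Return the response body from curl_exec() instead of printing it
    curl_setopt($http, CURLOPT_RETURNTRANSFER, true);
    $result = curl_exec($http);
    curl_close($http);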
    

    The User-Agent header is what servers look at to see which browser you are using. I'm not 100% sure this will get past Facebook's checks and give you ALL the information on the page, but it's definitely worth a try! :)
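
To tie this back to the status code the question asks about, here is a minimal sketch that combines the user agent with redirect following. cURL does not follow redirects unless CURLOPT_FOLLOWLOCATION is set, which may be one reason a 301 comes back instead of a 200 or 404. The helper names build_browser_options and fetch_status, the redirect cap, and the timeout are my own assumptions for illustration, not anything from the original post, and there is no guarantee Facebook will serve the full page to such a request.

```php
<?php
// Hypothetical helper (not from the original post): builds the cURL options
// for a browser-like request. Kept separate so it can be inspected without
// touching the network.
function build_browser_options(string $userAgent): array
{
    return [
        CURLOPT_RETURNTRANSFER => true,       // return the body instead of printing it
        CURLOPT_USERAGENT      => $userAgent, // sent as the User-Agent header
        CURLOPT_FOLLOWLOCATION => true,       // follow 301/302 redirects
        CURLOPT_MAXREDIRS      => 5,          // assumed cap to avoid redirect loops
        CURLOPT_TIMEOUT        => 10,         // assumed timeout, in seconds
    ];
}

// Hypothetical helper: returns the HTTP status code of the final response
// after redirects, or null if the transfer itself failed.
function fetch_status(string $url, string $userAgent): ?int
{
    $http = curl_init($url);
    curl_setopt_array($http, build_browser_options($userAgent));
    curl_exec($http);
    $status = curl_errno($http)
        ? null
        : (int) curl_getinfo($http, CURLINFO_HTTP_CODE);
    curl_close($http);
    return $status;
}
```

Calling fetch_status('https://www.facebook.com/cocacola', 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7) Gecko/20040803 Firefox/0.9.3') would then give the status of the page you end up on rather than the status of the first redirect. Returning null on a transport error, instead of calling exit(), leaves the caller free to decide how to handle failures.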
