dream752614590 2016-12-14 11:02
Views: 154
Accepted

Getting the "title" and "description" from an external page link

I am trying to get the title and description from the source of an external page link. This does not work for Facebook pages: the request returns the source code of a different page. It works for other websites such as Google. Here is my PHP code:

function file_get_contents_curl($url){
   $ch = curl_init();
   curl_setopt($ch, CURLOPT_HEADER, 0);
   curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
   curl_setopt($ch, CURLOPT_URL, $url);
   curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
   $data = curl_exec($ch);
   curl_close($ch);
   return $data;
}

public function previewLink(){
   $url = "https://www.facebook.com/NASA/";
   $html = $this->file_get_contents_curl($url);
   $title = "";
   $description ="";
   $image = "";

   //parsing begins here:
   $doc = new \DOMDocument();
   @$doc->loadHTML($html);
   $nodes = $doc->getElementsByTagName('title');
   // nodeValue is a property, not a method
   $title = $nodes->item(0)->nodeValue;
  }

I can't figure out what the problem is. Can someone suggest something? Thanks in advance.

1 answer

  • dongshi2836 2016-12-14 11:45

    Facebook requires a User-Agent header in the HTTP request. You can add one like this:

    curl_setopt($ch, CURLOPT_HTTPHEADER, array('User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_4) AppleWebKit/600.7.12 (KHTML, like Gecko) Version/8.0.7 Safari/600.7.12'));
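For context, here is a minimal sketch of how a User-Agent could be wired into a fetch helper like the asker's, assuming the cURL extension is loaded. It uses `CURLOPT_USERAGENT`, which sets the same request header as the `CURLOPT_HTTPHEADER` line above. The function names `buildCurlOptions` and `fetchPage` and the timeout value are illustrative assumptions, not part of the original post.

```php
<?php
// Hypothetical helper: gathers the cURL options in one array so they are easy to inspect.
function buildCurlOptions(string $url, string $userAgent): array
{
    return [
        CURLOPT_URL            => $url,
        CURLOPT_HEADER         => false,      // do not include response headers in the body
        CURLOPT_RETURNTRANSFER => true,       // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true,       // follow HTTP redirects
        CURLOPT_USERAGENT      => $userAgent, // sent as the User-Agent request header
        CURLOPT_TIMEOUT        => 15,         // assumed value, in seconds
    ];
}

// Hypothetical helper: returns the page body, or null if the request failed.
function fetchPage(string $url, string $userAgent): ?string
{
    $ch = curl_init();
    curl_setopt_array($ch, buildCurlOptions($url, $userAgent));
    $data = curl_exec($ch); // string on success, false on failure
    return $data === false ? null : $data;
}
```

A call might look like `fetchPage("https://www.facebook.com/NASA/", "Mozilla/5.0 ...")`, with whatever browser-like User-Agent string you choose. Whether Facebook then serves the real page is not guaranteed.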
    

    FYI: Facebook may display a CAPTCHA page when a page is requested without being logged in.
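Because the response may be a CAPTCHA or login page rather than the page you asked for, it seems safer to check that the expected nodes exist before reading them. The sketch below assumes the DOM extension is available; the function name `extractPreview` and the choice to accept either a `description` or an `og:description` meta tag are assumptions for illustration, not something the original post specifies.

```php
<?php
// Hypothetical helper: extracts the <title> text and a description meta tag
// from an HTML string. Missing values are returned as null.
function extractPreview(string $html): array
{
    $preview = ['title' => null, 'description' => null];
    if (trim($html) === '') {
        return $preview; // nothing to parse
    }

    $doc = new DOMDocument();
    // Collect parser warnings internally instead of silencing them with "@".
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // item(0) is null when the page has no <title> element.
    $titleNode = $doc->getElementsByTagName('title')->item(0);
    if ($titleNode !== null) {
        $preview['title'] = trim($titleNode->nodeValue);
    }

    // Take the first meta tag named "description" or "og:description".
    foreach ($doc->getElementsByTagName('meta') as $meta) {
        $key = strtolower($meta->getAttribute('name') ?: $meta->getAttribute('property'));
        if ($preview['description'] === null
            && ($key === 'description' || $key === 'og:description')) {
            $preview['description'] = trim($meta->getAttribute('content'));
        }
    }

    return $preview;
}
```

If both values come back as null for a URL that should have them, that is a hint the server returned a different page than expected.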

    This answer was selected as the best answer by the asker.
