Is there a way to get around a 403 error with PHP file_get_contents?

I'm trying to fetch a specific webpage using PHP's file_get_contents. When I view the page directly in a browser there is no problem, but when I try to grab it with PHP I get "failed to open stream: HTTP request failed! HTTP/1.1 403 Forbidden". There's a piece of data that I'm trying to extract from the page.

$ft = file_get_contents('https://www.vesselfinder.com/vessels/CELEBRITY-MILLENNIUM-IMO-9189419-MMSI-249055000');

echo $ft;

I've read various pages here about using stream_context_create, mainly the user agent part:

$context = stream_context_create(
    array(
        "http" => array(
            "header" => "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/50.0.2661.102 Safari/537.36"
        )
    )
);

But nothing works, and I now get a 400 error instead. Unfortunately it doesn't look like my server is configured to use cURL, so file_get_contents seems to be the only way for me to do this.

duanbo6482 2017-12-02 19:49
You need to include the header name in the header option, i.e. the value must be a full "User-Agent: ..." line rather than just the browser string:

$context = stream_context_create(
    array(
        'http' => array(
            'header' => 'User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/50.0.2661.102 Safari/537.36',
        ),
    )
);
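
A header value with no "Name:" prefix is sent as-is, so the server receives a malformed line, which is consistent with the 400 in the question. As a minimal sketch, the helpers below check the shape of each line before building the context. Note that `is_header_line` and `build_header_context` are hypothetical names, not part of PHP, and the Accept header in the usage line is only an illustrative extra.

```php
<?php
// Hypothetical helper: true when $line looks like "Name: value".
function is_header_line(string $line): bool
{
    return preg_match('/^[A-Za-z0-9-]+:\s*\S/', $line) === 1;
}

// Hypothetical helper: build an HTTP context from a list of header lines,
// rejecting any line that lacks a "Name:" prefix.
function build_header_context(array $lines)
{
    foreach ($lines as $line) {
        if (!is_header_line($line)) {
            throw new InvalidArgumentException("Malformed header line: $line");
        }
    }

    // The 'header' option accepts an array of lines
    // (or a single string with lines joined by "\r\n").
    return stream_context_create(array('http' => array('header' => $lines)));
}
```

Usage would look like `$context = build_header_context(array('User-Agent: YourApplication/1.0', 'Accept: text/html'));`. Passing the bare "Mozilla/5.0 ..." string from the question throws instead of silently producing a bad request.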

You could also use the user_agent option:

$context = stream_context_create(
    array(
        'http' => array(
            'user_agent' => 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/50.0.2661.102 Safari/537.36',
        ),
    )
);

Both of the examples above should work, and you should now be able to get the contents by passing the context as the third argument:

$content = file_get_contents('https://www.vesselfinder.com/vessels/CELEBRITY-MILLENNIUM-IMO-9189419-MMSI-249055000', false, $context);
echo $content;
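
When a request like this still fails, it can help to see the status code rather than only a warning. The sketch below assumes nothing beyond the standard HTTP context options (`user_agent`, `ignore_errors`, `timeout`). The names `parse_status` and `fetch_page` are hypothetical, and the 10-second timeout is an arbitrary illustrative value.

```php
<?php
// Hypothetical helper: pull the numeric code out of a status line such as
// "HTTP/1.1 403 Forbidden"; returns 0 if the line does not match.
function parse_status(string $statusLine): int
{
    return preg_match('#^HTTP/\S+\s+(\d{3})#', $statusLine, $m) === 1 ? (int) $m[1] : 0;
}

// Hypothetical helper: fetch $url with a custom User-Agent and report the status.
function fetch_page(string $url, string $userAgent): array
{
    $context = stream_context_create(array(
        'http' => array(
            'user_agent'    => $userAgent,
            'ignore_errors' => true, // hand back the body even on 4xx/5xx
            'timeout'       => 10,   // seconds; illustrative value
        ),
    ));

    $body = @file_get_contents($url, false, $context);

    // The HTTP wrapper fills $http_response_header in the calling scope.
    // After redirects it holds several status lines; keep the last one.
    $status = 0;
    foreach ($http_response_header ?? array() as $line) {
        if (strpos($line, 'HTTP/') === 0) {
            $status = parse_status($line);
        }
    }

    return array('status' => $status, 'body' => $body === false ? null : $body);
}
```

Calling `fetch_page($url, 'YourApplication/1.0')` returns an array whose 'status' entry can be compared against 200, 400 or 403 before the 'body' is used. A status of 0 means no HTTP response was received at all.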

This could of course also be tested using curl from the command line. Notice that we are setting our own User-Agent header:

curl --verbose -H 'User-Agent: YourApplication/1.0' 'https://www.vesselfinder.com/vessels/CELEBRITY-MILLENNIUM-IMO-9189419-MMSI-249055000'

It might also be worth knowing that the default User-Agent used by curl seems to be blocked, so if using curl you need to add your own using the -H flag.
This answer was selected by the asker as the best answer.