dream0776 2018-06-12 18:20

How can I avoid 503 errors when scraping Instagram with PHP?

I am working on a website and want to retrieve the 6 most recent photos from an Instagram account, using each photo's caption as its hover text. My code below works, but after I refresh the page once or twice I start getting this error:

Warning: file_get_contents(http://instagram.com/green_tree_relief): failed to open stream: HTTP request failed! HTTP/1.1 503 No server is available for the request on line 14

I assume this is Instagram blocking me for my behavior?

I am looking for a way around this, if possible. I tried spoofing a user agent, but that didn't work. If I understand their API deprecation correctly, the API can no longer be used to get even public content. I can see this being an ethical gray area, so abandoning this functionality altogether is an option, but I would like to get it to work.
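For reference, a minimal sketch of how a user agent can be set on a `file_get_contents` request through a stream context, and how the status code can be read from the response headers. The helper names `build_insta_context` and `http_status_from_headers` and the user agent string are hypothetical, and nothing here is known to get past the 503; it only makes the failure easier to detect.

```php
<?php
// Hypothetical helper: builds an HTTP stream context with a custom User-Agent.
function build_insta_context(string $userAgent, int $timeoutSeconds = 10)
{
    return stream_context_create([
        'http' => [
            'method'        => 'GET',
            'user_agent'    => $userAgent,      // sent as the User-Agent header
            'timeout'       => $timeoutSeconds, // give up instead of hanging the page
            'ignore_errors' => true,            // return the body even on 4xx/5xx
        ],
    ]);
}

// Hypothetical helper: extracts the final HTTP status code from raw header lines,
// such as the ones PHP exposes after an http:// request.
function http_status_from_headers(array $headers): int
{
    // After redirects there can be several status lines; the last one is final.
    foreach (array_reverse($headers) as $line) {
        if (preg_match('#^HTTP/\S+\s+(\d{3})#', $line, $m)) {
            return (int) $m[1];
        }
    }
    return 0; // no status line found
}

// Placeholder user agent string, not a recommendation.
$context = build_insta_context('Mozilla/5.0 (compatible; ExampleSite/1.0)');
```

The context would be passed as the third argument, for example `file_get_contents($url, false, $context)`, after which the header lines of the response can be checked for a 503 before any parsing is attempted.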

Anyway, my main page makes an AJAX call to this PHP script and inserts the generated HTML on success:

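function scrape_insta($username) {

    // Suppress the warning and check the result so a failed request (e.g. a 503) can be handled.
    $insta_source = @file_get_contents('http://instagram.com/'.$username);
    if ($insta_source === false) {
        return null;
    }

    // The profile data is embedded in the page as JSON assigned to window._sharedData.
    $shards = explode('window._sharedData = ', $insta_source, 2);
    if (count($shards) < 2) {
        return null;
    }
    $insta_json = explode(';</script>', $shards[1], 2);
    return json_decode($insta_json[0], TRUE);
}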

$my_account = 'green_tree_relief';

$photostreamHTML = '';

$results_array = scrape_insta($my_account);

// Look up the list of recent posts once instead of repeating the full path.
$edges = $results_array['entry_data']['ProfilePage'][0]['graphql']['user']['edge_owner_to_timeline_media']['edges'] ?? array();

foreach (array_slice($edges, 0, 6) as $edge) {

    if (!isset($edge['node']['display_url'])) {
        continue;
    }

    // Reset the caption for every post so a post without one does not reuse the previous caption.
    $caption = $edge['node']['edge_media_to_caption']['edges'][0]['node']['text'] ?? '';

    // Escape both values because they end up inside HTML attributes.
    $photostreamHTML .= '<div style="height: 84px;">
                            <a href="https://www.instagram.com/' . $my_account . '" target="_blank">
                                <img src="' . htmlspecialchars($edge['node']['display_url'], ENT_QUOTES) . '"
                                class="img-responsive" title="' . htmlspecialchars($caption, ENT_QUOTES) . '">
                            </a>
                        </div>';
}
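To make the parsing step easier to follow without hitting Instagram, here is a small offline sketch of the same `window._sharedData` extraction run against a made-up HTML string. The helper name `extract_shared_data` and the sample data are hypothetical; the sample is trimmed to the keys the loop above reads and does not reflect the full structure of a real profile page.

```php
<?php
// Hypothetical helper: pulls the window._sharedData JSON out of a page's HTML.
// Returns the decoded array, or null when the marker is missing or the JSON is invalid.
function extract_shared_data(string $html): ?array
{
    $parts = explode('window._sharedData = ', $html, 2);
    if (count($parts) < 2) {
        return null; // marker not present, e.g. an error page
    }
    $json = explode(';</script>', $parts[1], 2)[0];
    $data = json_decode($json, true);
    return is_array($data) ? $data : null;
}

// Made-up sample containing a single post.
$sample = [
    'entry_data' => ['ProfilePage' => [[
        'graphql' => ['user' => ['edge_owner_to_timeline_media' => ['edges' => [[
            'node' => [
                'display_url' => 'https://example.com/a.jpg',
                'edge_media_to_caption' => ['edges' => [['node' => ['text' => 'First post']]]],
            ],
        ]]]]],
    ]]],
];
$sample_html = '<script>window._sharedData = ' . json_encode($sample) . ';</script>';

$data = extract_shared_data($sample_html);
$node = $data['entry_data']['ProfilePage'][0]['graphql']['user']['edge_owner_to_timeline_media']['edges'][0]['node'];
echo $node['display_url'], ' | ', $node['edge_media_to_caption']['edges'][0]['node']['text'], "\n";
```

Returning null for a page without the marker means an error page simply produces no photos instead of PHP notices.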

As I said, it works the first couple of times I load the page after I haven't worked on the site for a few hours, but then it fails.
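Since the failures only start after a few page loads, one thing that could be tried is caching the scraped result on disk so that most page loads never contact Instagram at all. The sketch below assumes the 503 is related to how often the profile page is requested, which is only a guess; `cached_fetch`, the cache file path, and the lifetime are hypothetical names and values.

```php
<?php
// Hypothetical helper: returns cached data while it is fresh, otherwise calls $fetch.
// $fetch must return an array on success or null on failure.
function cached_fetch(string $cacheFile, int $ttlSeconds, callable $fetch): ?array
{
    clearstatcache(true, $cacheFile); // make sure file checks below are not stale

    // Serve from the cache while it is younger than the TTL.
    if (is_file($cacheFile) && (time() - filemtime($cacheFile)) < $ttlSeconds) {
        $cached = json_decode(file_get_contents($cacheFile), true);
        if (is_array($cached)) {
            return $cached;
        }
    }

    // Cache missing or expired: try a real fetch and store the result.
    $fresh = $fetch();
    if (is_array($fresh)) {
        file_put_contents($cacheFile, json_encode($fresh), LOCK_EX);
        return $fresh;
    }

    // Fetch failed (e.g. a 503): fall back to the stale copy if there is one.
    if (is_file($cacheFile)) {
        $stale = json_decode(file_get_contents($cacheFile), true);
        return is_array($stale) ? $stale : null;
    }
    return null;
}
```

With the function from the question this might be called as `cached_fetch(sys_get_temp_dir() . '/insta_cache.json', 3600, function () { return scrape_insta('green_tree_relief'); })`, so the profile page is requested at most about once an hour, and an old copy is shown when a request fails.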

Any suggestions would be appreciated.
