dsgwoh7038 2015-03-23 22:37

How to avoid some websites rejecting HTTP requests made from Go

We have a script that on a daily basis checks all of the web links in all of our database records (the users want notifications when a link becomes out of date).

There are a couple of sites that work fine through a web browser from this IP address, but when fetched through Go, they either disconnect before completing the request or return an HTTP authorisation denied message.

I am assuming some sort of firewall (F5) is filtering/blocking the request. This occurs even when I change the HTTP request to use a common user agent. What can we do to ensure a Go request looks like a standard browser?

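import (
    "net/http"
    "time"
)

// fetchURL issues a GET for url and returns the HTTP status code.
func fetchURL(url string, d time.Duration) (int, error) {
    // The timeout covers the whole exchange, from connecting to reading the body.
    client := &http.Client{
        Timeout: d,
    }

    req, err := http.NewRequest("GET", url, nil)
    if err != nil {
        return 0, err
    }

    // Present a common browser user agent (Safari on an iPad).
    req.Header.Set("User-Agent", "Mozilla/5.0 (iPad; CPU OS 7_0 like Mac OS X) AppleWebKit/537.51.1 (KHTML, like Gecko) Version/7.0 Mobile/11A465 Safari/9537.53")

    resp, err := client.Do(req)
    if err != nil {
        return 0, err
    }
    // Close the body on every return path so the connection is released.
    defer resp.Body.Close()

    return resp.StatusCode, nil
}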

1 answer

  • dou6495 2015-03-24 01:17

    Try matching the exact headers from a request from your web browser to eliminate other factors. A smart firewall could have heuristics on what looks like a web browser versus a robot.
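
To compare the two sides, it helps to see exactly what the Go client puts on the wire. One possible way, sketched here, is `httputil.DumpRequestOut` from the standard library, which renders an outgoing request including the headers the transport adds itself. The helper name `dumpOutgoing` and the `localhost:3030` URL are placeholders chosen for illustration.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httputil"
)

// dumpOutgoing (hypothetical helper) returns the wire form of a GET
// request to url as the Go client would send it, including headers the
// transport adds on its own, such as User-Agent and Accept-Encoding.
func dumpOutgoing(url string) (string, error) {
	req, err := http.NewRequest("GET", url, nil)
	if err != nil {
		return "", err
	}
	// false: leave the body out of the dump (a GET has none anyway).
	b, err := httputil.DumpRequestOut(req, false)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	// Placeholder URL; substitute one of the sites that rejects the request.
	out, err := dumpOutgoing("http://localhost:3030/foo")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Print(out)
}
```

The printed text can then be compared line by line with the request headers shown in the browser's developer tools. The default User-Agent string depends on the Go release in use.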

    Notice that the Go HTTP client sends only a minimal HTTP request:

    GET /foo HTTP/1.1
    Host: localhost:3030
    User-Agent: Go 1.1 package http
    Accept-Encoding: gzip
    

    Whereas a web browser is more chatty:

    GET /foo HTTP/1.1
    Host: localhost:3030
    Connection: keep-alive
    Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
    User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2272.89 Safari/537.36
    Accept-Encoding: gzip, deflate, sdch
    Accept-Language: en-US,en;q=0.8
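
A minimal sketch of copying the browser's headers onto a Go request follows. The header values are taken from the browser capture above. The names `browserHeaders`, `newBrowserLikeRequest`, and `fetchStatus` are made up for this example, and whether this satisfies any particular firewall is not guaranteed, since it may also look at things this sketch does not control.

```go
package main

import (
	"fmt"
	"net/http"
	"time"
)

// browserHeaders holds values copied from the browser capture above.
// Replace them with what your own browser sends to the target site.
var browserHeaders = map[string]string{
	"Accept":          "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8",
	"User-Agent":      "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2272.89 Safari/537.36",
	"Accept-Language": "en-US,en;q=0.8",
	// Setting Accept-Encoding by hand turns off net/http's transparent
	// gzip decoding. That is harmless here because only the status code
	// is read, never the body.
	"Accept-Encoding": "gzip, deflate, sdch",
}

// newBrowserLikeRequest builds a GET request carrying browserHeaders.
func newBrowserLikeRequest(url string) (*http.Request, error) {
	req, err := http.NewRequest("GET", url, nil)
	if err != nil {
		return nil, err
	}
	for name, value := range browserHeaders {
		req.Header.Set(name, value)
	}
	return req, nil
}

// fetchStatus sends the browser-like request and returns the status code.
func fetchStatus(url string, d time.Duration) (int, error) {
	req, err := newBrowserLikeRequest(url)
	if err != nil {
		return 0, err
	}
	client := &http.Client{Timeout: d}
	resp, err := client.Do(req)
	if err != nil {
		return 0, err
	}
	defer resp.Body.Close()
	return resp.StatusCode, nil
}

func main() {
	// Build the request without sending it and show what was set.
	req, err := newBrowserLikeRequest("http://localhost:3030/foo")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for name := range browserHeaders {
		fmt.Printf("%s: %s\n", name, req.Header.Get(name))
	}
}
```

If the site still refuses the request with these headers, adding or removing one header at a time can help narrow down which one the filter is keying on.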
    
    This answer was selected by the asker as the best answer.