2013-05-16 07:15
Views: 173


I have a script that grabs content from third-party sites. When a URL is not found, the site redirects with a 302 and a Location header to a custom "not found" page instead of sending a 404. The script also caches the content returned by curl_exec, but I don't want to cache the error pages. Is there a way to log those redirects when CURLOPT_FOLLOWLOCATION is turned on? How can I solve this? I know I could look for the error message with a DOM parser and discard the page if it is found, but I want to know whether there are other ways to accomplish this.

3 answers

  • dpoxk64080 2013-05-17 12:50

    I ended up disabling CURLOPT_FOLLOWLOCATION, so I just have to catch the 302 code; if it is present, I don't cache the page. I thought there would be a way of catching all the status codes before curl follows the redirects.
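
The approach described in this answer could be sketched roughly as follows. This is a minimal illustration, not the poster's actual code: the function names `isCacheableStatus` and `fetchForCache` are hypothetical, and treating only 2xx responses as cacheable is an assumption.

```php
<?php
// Hypothetical helper: treat only 2xx responses as cacheable (an assumption).
function isCacheableStatus(int $status): bool
{
    return $status >= 200 && $status < 300;
}

// Hypothetical helper: fetch $url without following redirects.
// Returns the body, or null when the response should not be cached.
function fetchForCache(string $url): ?string
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);  // return the body instead of printing it
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, false); // keep the 302 visible to the caller
    $body = curl_exec($ch);
    $status = (int) curl_getinfo($ch, CURLINFO_HTTP_CODE);

    if ($body === false || !isCacheableStatus($status)) {
        return null; // transfer error, redirect, or error status: skip the cache
    }
    return $body;
}
```

With redirects disabled, a 302 from the third-party site shows up directly in `CURLINFO_HTTP_CODE`, so the caller can simply skip caching whenever `fetchForCache` returns null.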

  • dtah63820 2013-05-16 07:54

    Have a look at "Easy way to test a URL for 404 in PHP?".

    Then, using that, simply do not cache the page if the response is a 404.
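
A status check of this kind might look like the sketch below. Because the site in the question answers with a 302 rather than a 404, the sketch also reads `CURLINFO_REDIRECT_COUNT`, which stays available when `CURLOPT_FOLLOWLOCATION` is on. The function names are hypothetical, and treating any followed redirect as an error page is an assumption that only holds if the site never redirects for valid URLs.

```php
<?php
// Hypothetical helper: decide from the final status code and the redirect
// count whether a response looks like an error page.
// Assumption: on this site, any followed redirect leads to a "not found" page.
function looksLikeErrorPage(int $finalStatus, int $redirectCount): bool
{
    return $finalStatus >= 400 || $redirectCount > 0;
}

// Hypothetical helper: fetch $url while following redirects and report
// what happened along the way.
function fetchFollowingRedirects(string $url): array
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the body instead of printing it
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); // follow Location headers
    $body = curl_exec($ch);

    return [
        'body'      => $body,
        'status'    => (int) curl_getinfo($ch, CURLINFO_HTTP_CODE),      // code of the last response
        'redirects' => (int) curl_getinfo($ch, CURLINFO_REDIRECT_COUNT), // redirects followed
        'final_url' => curl_getinfo($ch, CURLINFO_EFFECTIVE_URL),        // URL actually served
    ];
}
```

A caller would cache `$result['body']` only when it is not `false` and `looksLikeErrorPage($result['status'], $result['redirects'])` returns false. Comparing `final_url` against the known address of the custom error page would be a stricter alternative if the site also redirects for legitimate reasons.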

  • dprxj1995 2013-05-17 03:11
