dtjo87679 2017-10-27 05:20
Views: 18
Accepted

Form submission with multiple redirects

I'm trying to fetch data from a website where submitting the form redirects to a loading page, which then redirects automatically to the final results page. The problem is that my crawler only gets the loading page's data and never reaches the final results page, which is what I actually need. Can someone please tell me how to achieve that? If it isn't possible, what would be an alternative way to do this?

1 answer

  • dongqian1925 2017-10-27 05:34

    If you're using curl, you can try the following:

    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
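
    For context, here is a minimal sketch of a complete request with that option set. The helper name fetch_following_redirects(), the redirect limit and the timeout are assumptions for illustration, not part of the original question. The in-memory cookie jar is also an assumption: many form flows depend on a session cookie surviving each hop, but this site may not need it.

```php
<?php
// Sketch only: fetch a URL and let cURL follow HTTP 3xx redirects.
// fetch_following_redirects() is a hypothetical helper name.
function fetch_following_redirects(string $url, int $maxRedirects = 10): array
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,          // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true,          // follow Location: headers
        CURLOPT_MAXREDIRS      => $maxRedirects, // assumed limit, guards against loops
        CURLOPT_COOKIEFILE     => '',            // empty string enables in-memory cookies
        CURLOPT_TIMEOUT        => 30,            // assumed timeout in seconds
    ]);

    $body = curl_exec($ch);
    if ($body === false) {
        throw new RuntimeException('cURL request failed: ' . curl_error($ch));
    }

    // The effective URL is where cURL ended up after all HTTP redirects.
    return [
        'url'  => curl_getinfo($ch, CURLINFO_EFFECTIVE_URL),
        'body' => $body,
    ];
}
```

    If the returned 'url' is still the loading page, the redirect is probably not happening at the HTTP level.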

    If you still aren't getting past the loading page, it's possible that it's not an HTTP redirect.

    In that case you'll have to parse the target location out of the loading page yourself and request it separately. A lot of websites use a meta refresh tag for such loading pages. Look for something similar to the following:

    <meta http-equiv="refresh" content="5; url=http://example.com/" />

    You can parse the above with a regex or any DOM parsing library for PHP.
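
    One possible way to do that with PHP's built-in DOMDocument is sketched below. The function name find_meta_refresh_url() is hypothetical, and the sketch assumes the content attribute has the usual "seconds; url=..." shape. A relative URL would still need to be resolved against the loading page's address before requesting it.

```php
<?php
// Sketch only: extract the target of a <meta http-equiv="refresh"> tag.
// find_meta_refresh_url() is a hypothetical helper name.
function find_meta_refresh_url(string $html): ?string
{
    if ($html === '') {
        return null; // loadHTML() rejects empty input
    }

    $doc = new DOMDocument();
    // Real-world pages are rarely valid HTML; collect parser warnings quietly.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    foreach ($doc->getElementsByTagName('meta') as $meta) {
        if (strcasecmp($meta->getAttribute('http-equiv'), 'refresh') !== 0) {
            continue;
        }
        // Assumed shape of the attribute: "5; url=http://example.com/"
        $content = $meta->getAttribute('content');
        if (preg_match('/url\s*=\s*["\']?([^"\'\s]+)/i', $content, $m)) {
            return $m[1];
        }
    }

    return null; // no meta refresh with a URL was found
}
```

    The returned URL can then be fetched with a second cURL request.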

    Another possibility is a JavaScript redirect. Look for lines containing window.location in the page source.
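
    A rough sketch of that search follows. The function name find_js_redirect_url() and both patterns are assumptions: they only catch a URL written as a plain string literal, either assigned to location or passed to location.replace() or location.assign(). A URL that the script builds at runtime will not be found this way.

```php
<?php
// Sketch only: look for a literal URL in a JavaScript redirect.
// find_js_redirect_url() is a hypothetical helper name.
function find_js_redirect_url(string $html): ?string
{
    $patterns = [
        // window.location = "..."; location.href = '...'; document.location = "..."
        '/(?:window\.|document\.)?location(?:\.href)?\s*=\s*["\']([^"\']+)["\']/i',
        // location.replace("...") or location.assign('...')
        '/location\.(?:replace|assign)\(\s*["\']([^"\']+)["\']\s*\)/i',
    ];

    foreach ($patterns as $pattern) {
        if (preg_match($pattern, $html, $m)) {
            return $m[1]; // first literal URL that matched
        }
    }

    return null; // nothing that looks like a literal redirect target
}
```

    As with a meta refresh, the extracted URL may be relative and has to be requested in a separate step.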

    Selected by the asker as the best answer.