extrafan 2021-01-27 15:43 Acceptance rate: 0%
Views: 121

[Urgent] Scraping website data with the requests library returns an empty result

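I am using the requests library to scrape data from a company website. With Fiddler I compared the POST request sent when the page fetches data normally against the one sent by my requests code and could not see any difference, yet the code gets no data back.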

The POST request captured when the web page fetches data normally:

POST http://10.245.0.225/REPORT_FM/MainController.do?method=queryDatas&type=SF&startTime=2021-01-26%2015:00:00&endTime=2021-01-27%2015:00:00&selectType=null HTTP/1.1
Host: 10.245.0.225
Connection: keep-alive
Content-Length: 390
Accept: */*
X-Requested-With: XMLHttpRequest
Content-Type: application/x-www-form-urlencoded; charset=UTF-8
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.141 Safari/537.36
__REQUEST_TYPE: AJAX_REQUEST
Origin: http://10.245.0.225
Referer: http://10.245.0.225/REPORT_FM/base/outStatistics/statisticsSF.jsp?globalUniqueID=D3AFCA5F9D40478F9EC64D28D1C26A40
Accept-Encoding: gzip, deflate
Accept-Language: zh-CN,zh;q=0.9,en;q=0.8
Cookie: JSESSIONID=4D3C9ABEE95EE1860D281DDF7C8FE1CA

dtGridPager=%7B%22isExport%22%3Afalse%2C%22pageSize%22%3A300%2C%22startRecord%22%3A0%2C%22nowPage%22%3A1%2C%22recordCount%22%3A-1%2C%22pageCount%22%3A-1%2C%22parameters%22%3A%7B%22startTime%22%3A%222021-01-21+00%3A00%3A00%22%2C%22endTime%22%3A%222021-01-22+00%3A00%3A00%22%7D%2C%22fastQueryParameters%22%3A%7B%7D%2C%22advanceQueryConditions%22%3A%5B%5D%2C%22advanceQuerySorts%22%3A%5B%5D%7D
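For reference, the form body above is a single field, dtGridPager, holding URL-encoded JSON. A minimal sketch for decoding it with the standard library follows; the helper name decode_pager is made up for illustration and is not part of the site or of requests.

```python
import json
from urllib.parse import parse_qs

# Form body copied verbatim from the captured POST request.
RAW_BODY = (
    "dtGridPager=%7B%22isExport%22%3Afalse%2C%22pageSize%22%3A300%2C"
    "%22startRecord%22%3A0%2C%22nowPage%22%3A1%2C%22recordCount%22%3A-1%2C"
    "%22pageCount%22%3A-1%2C%22parameters%22%3A%7B%22startTime%22%3A"
    "%222021-01-21+00%3A00%3A00%22%2C%22endTime%22%3A%222021-01-22+00%3A00%3A00%22%7D%2C"
    "%22fastQueryParameters%22%3A%7B%7D%2C%22advanceQueryConditions%22%3A%5B%5D%2C"
    "%22advanceQuerySorts%22%3A%5B%5D%7D"
)


def decode_pager(raw_body: str) -> dict:
    """Decode the form body and parse the JSON stored in dtGridPager."""
    fields = parse_qs(raw_body, keep_blank_values=True)
    return json.loads(fields["dtGridPager"][0])


pager = decode_pager(RAW_BODY)
print(json.dumps(pager, indent=2))
```

Decoded this way, the body's parameters carry the range 2021-01-21 00:00:00 to 2021-01-22 00:00:00, while the query string of the same request asks for 2021-01-26 to 2021-01-27. Whether the server reads the range from the URL, the body, or both is not something the capture shows.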

The POST request captured when my code fetches the page:

POST http://10.245.0.225/REPORT_FM/MainController.do?method=queryDatas&type=SF&startTime=2021-01-26%2011:00:00&endTime=2021-01-27%2011:00:00&selectType=null HTTP/1.1
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.141 Safari/537.36
Accept-Encoding: gzip, deflate
Accept: */*
Connection: keep-alive
__REQUEST_TYPE: AJAX_REQUEST
Accept-Language: zh-CN,zh;q=0.9,en;q=0.8
Cookie: JSESSIONID=6913FCE39C0D7C01D83904E1C5A2EA2F
Host: 10.245.0.225
Origin: http://10.245.0.225
Referer: http://10.245.0.225/REPORT_FM/base/outStatistics/statisticsSF.jsp?globalUniqueID=DD79A370C1FB4F23A8744E40DB12B700
Content-Type: application/x-www-form-urlencoded; charset=UTF-8
X-Requested-With: XMLHttpRequest
Content-Length: 390

dtGridPager=%7B%22isExport%22%3Afalse%2C%22pageSize%22%3A300%2C%22startRecord%22%3A0%2C%22nowPage%22%3A1%2C%22recordCount%22%3A-1%2C%22pageCount%22%3A-1%2C%22parameters%22%3A%7B%22startTime%22%3A%222021-01-21+00%3A00%3A00%22%2C%22endTime%22%3A%222021-01-22+00%3A00%3A00%22%7D%2C%22fastQueryParameters%22%3A%7B%7D%2C%22advanceQueryConditions%22%3A%5B%5D%2C%22advanceQuerySorts%22%3A%5B%5D%7D
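Since the two captures list their headers in a different order, comparing them by eye is error-prone. The sketch below diffs the two header blocks mechanically; parse_headers and diff_headers are illustrative helper names, and the header text is copied from the two captures above.

```python
BROWSER_HEADERS = """\
Host: 10.245.0.225
Connection: keep-alive
Content-Length: 390
Accept: */*
X-Requested-With: XMLHttpRequest
Content-Type: application/x-www-form-urlencoded; charset=UTF-8
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.141 Safari/537.36
__REQUEST_TYPE: AJAX_REQUEST
Origin: http://10.245.0.225
Referer: http://10.245.0.225/REPORT_FM/base/outStatistics/statisticsSF.jsp?globalUniqueID=D3AFCA5F9D40478F9EC64D28D1C26A40
Accept-Encoding: gzip, deflate
Accept-Language: zh-CN,zh;q=0.9,en;q=0.8
Cookie: JSESSIONID=4D3C9ABEE95EE1860D281DDF7C8FE1CA
"""

SCRIPT_HEADERS = """\
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.141 Safari/537.36
Accept-Encoding: gzip, deflate
Accept: */*
Connection: keep-alive
__REQUEST_TYPE: AJAX_REQUEST
Accept-Language: zh-CN,zh;q=0.9,en;q=0.8
Cookie: JSESSIONID=6913FCE39C0D7C01D83904E1C5A2EA2F
Host: 10.245.0.225
Origin: http://10.245.0.225
Referer: http://10.245.0.225/REPORT_FM/base/outStatistics/statisticsSF.jsp?globalUniqueID=DD79A370C1FB4F23A8744E40DB12B700
Content-Type: application/x-www-form-urlencoded; charset=UTF-8
X-Requested-With: XMLHttpRequest
Content-Length: 390
"""


def parse_headers(raw: str) -> dict:
    """Turn a raw header block into {lowercased name: value}."""
    headers = {}
    for line in raw.strip().splitlines():
        name, sep, value = line.partition(":")
        if sep:
            headers[name.strip().lower()] = value.strip()
    return headers


def diff_headers(left: dict, right: dict) -> dict:
    """Return only the headers whose values differ or that exist on one side."""
    names = sorted(set(left) | set(right))
    return {n: (left.get(n), right.get(n)) for n in names if left.get(n) != right.get(n)}


differences = diff_headers(parse_headers(BROWSER_HEADERS), parse_headers(SCRIPT_HEADERS))
for name, (browser_value, script_value) in differences.items():
    print(f"{name}:\n  browser: {browser_value}\n  script:  {script_value}")
```

On these two captures only Cookie and Referer differ among the headers. The request lines also differ: the browser capture asks for 15:00:00 to 15:00:00 and the script capture for 11:00:00 to 11:00:00, so the two requests are not querying exactly the same window.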

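Yet the results are completely different. Fetching from the website normally returns data (Content-Length: 6337).

With my code, nothing comes back (Content-Length: 0).

I would appreciate any help from the experts here. Many thanks!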

2 answers

  • extrafan 2021-01-27 15:47

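    Additional note: the conditions for both operations are identical. Apart from the Cookie and the dynamic variable globalUniqueID, I have compared everything else repeatedly and it all matches. I believe the request itself is constructed correctly, yet no data comes back at all. Very frustrating!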
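The captures alone do not show why the response is empty. Since Cookie and globalUniqueID are the only values that differ, one thing worth ruling out (an assumption, not something confirmed in this thread) is that the JSESSIONID sent by the script does not belong to a session that has actually opened the report page. The sketch below uses requests.Session so that cookies issued by the server are reused automatically. The function names, the split between query range and pager range, and the page_url argument are hypothetical; how the site issues globalUniqueID, and whether a login step is needed first, is not shown in the post.

```python
import json

import requests

BASE_URL = "http://10.245.0.225/REPORT_FM"

# Headers copied from the captured browser request.
AJAX_HEADERS = {
    "X-Requested-With": "XMLHttpRequest",
    "__REQUEST_TYPE": "AJAX_REQUEST",
    "Origin": "http://10.245.0.225",
    "Content-Type": "application/x-www-form-urlencoded; charset=UTF-8",
}


def build_pager(start: str, end: str, page_size: int = 300) -> dict:
    """Rebuild the dtGridPager structure seen in the captured body."""
    return {
        "isExport": False,
        "pageSize": page_size,
        "startRecord": 0,
        "nowPage": 1,
        "recordCount": -1,
        "pageCount": -1,
        "parameters": {"startTime": start, "endTime": end},
        "fastQueryParameters": {},
        "advanceQueryConditions": [],
        "advanceQuerySorts": [],
    }


def prepare_query(session, query_start, query_end, pager, referer):
    """Build the POST without sending it, so it can be inspected first."""
    request = requests.Request(
        "POST",
        f"{BASE_URL}/MainController.do",
        params={
            "method": "queryDatas",
            "type": "SF",
            "startTime": query_start,
            "endTime": query_end,
            "selectType": "null",
        },
        data={"dtGridPager": json.dumps(pager, separators=(",", ":"))},
        headers={**AJAX_HEADERS, "Referer": referer},
    )
    # prepare_request merges the session's cookies and default headers.
    return session.prepare_request(request)


def fetch(session, query_start, query_end, pager, page_url):
    """Assumed flow: open the report page first, then post the query."""
    session.get(page_url, timeout=10)  # lets the server set JSESSIONID
    prepared = prepare_query(session, query_start, query_end, pager, page_url)
    response = session.send(prepared, timeout=10)
    response.raise_for_status()
    return response
```

Calling prepare_query on its own and printing prepared.url, prepared.headers and prepared.body gives something to compare against the Fiddler capture before any request is sent. If the response is still empty with a session that has loaded the page, the session hypothesis can be dropped and the mismatch between the query-string range and the body range is the next thing to check.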
