kewenwu123 2021-11-11 20:33 Acceptance rate: 100%
Views: 122
Closed

Crawler POST request returns 403, so I can't scrape the data!

https://www.kickstarter.com/projects/northstargames/paint-the-roses-a-cooperative-puzzle-game-in-wonderland/
I need to scrape the comments on the page above.
Opening the comments redirects to:
https://www.kickstarter.com/projects/northstargames/paint-the-roses-a-cooperative-puzzle-game-in-wonderland/comments
Then, using a packet-capture tool, I found that the comments are returned as JSON by a request named graph.

[Image]

The code I used:

import requests

url = 'https://www.kickstarter.com/graph'
resp = requests.post(url)

print(resp)

This returns 403,
but the image shows 200.
So I can't get the JSON. Any help would be appreciated.

2 answers

  • CSDN Expert-showbo 2021-11-11 21:24

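    The headers need to include the cookie. Judging from the asker's code, the wrong value was added: it was copied from the response headers (set-cookie is what the server uses to set cookies). The cookie from the request headers is what should be sent.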
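To make the distinction concrete, here is a minimal sketch of how `Set-Cookie` values from a response map onto the single `Cookie` header a client sends back. The function name `cookie_header_from_set_cookie` and the sample values are made up for illustration; they are not Kickstarter's real cookies.

```python
from http.cookies import SimpleCookie
from typing import Iterable


def cookie_header_from_set_cookie(set_cookie_values: Iterable[str]) -> str:
    """Build a request `Cookie` header value from response `Set-Cookie` values."""
    pairs = []
    for raw in set_cookie_values:
        jar = SimpleCookie()
        jar.load(raw)  # parses "name=value; Path=/; HttpOnly" and similar
        for name, morsel in jar.items():
            # Only name=value is sent back; attributes such as Path are dropped.
            pairs.append(f"{name}={morsel.coded_value}")
    return "; ".join(pairs)


# Hypothetical response header values, for illustration only
example = ["lang=zh; Path=/", "vis=abc123; Path=/; HttpOnly"]
print(cookie_header_from_set_cookie(example))
```

In practice a `requests.Session` object does this bookkeeping automatically: it stores the cookies set by each response and sends them with later requests. Cookie and token values copied by hand from the browser belong to that browser session and will eventually expire.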

    Changing the code as follows works:

    [Image]

    import requests
    import json
    url='https://www.kickstarter.com/graph'
    headers={
        'user-agent':'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/94.0.4606.81 Safari/537.36',
        'x-csrf-token':'1UKwcE98l6tohNtp91YX1D7NMwjveSdY9hbI73ZJ3G1ZcyYGFqhdpJor6eYaj9ufOlYcZ_mR9__7Q8Bof64NkA',
        'referer':'https://www.kickstarter.com/projects/northstargames/paint-the-roses-a-cooperative-puzzle-game-in-wonderland/comments',
        'cookie':'vis=cf5a09940159cce5-3e81cd31b25114d2-502cc178dafc2b35v1; _pxvid=4f662ccf-1f93-11ec-8217-50705454435a; ajs_anonymous_id=%22cf5a09940159cce5-3e81cd31b25114d2-502cc178dafc2b35v1%22; _ga=GA1.2.1017090578.1632747879; __ssid=5c597397fdb43a0176a0b530424f8de; lang=zh; last_page=https%3A%2F%2Fwww.kickstarter.com%2Fprojects%2Fnorthstargames%2Fpaint-the-roses-a-cooperative-puzzle-game-in-wonderland; woe_id=R29kNklTd2t0KzhqYUV4Q0hEME16Zz09LS1GRGpEK1dLRjJCMTdIUEdWc2hJL2NBPT0%3D--ac6b290114f1815d012151a90a758b25ab94ca12; optimizely_current_variations=%7B%7D; _pxhd=AC8/XQyI8ZPb1Muwr6d88ZhU-q/d/dPBMJDDqFXTxGPapgEmZKea3pbzn1MK-lqTfhRh5A/YIPdeHg91d4fYBg==:LbVhGdska2coxQeqJpW4Tt-FX3qdPJz2B9XmIPfONbI5gpp6/w1-9w7d8m6RkrT7mFQftC0VYX/pHIvk7gUtuXScrdi2lHGFs-rvQ6cYkGs=; pxcts=cc18c320-42ed-11ec-aebe-45c40d3a0f33; local_offset=-56668; _gid=GA1.2.114541163.1636634973; _gat_creatorAnalytics=1; _gat=1; _px2=eyJ1IjoiY2JmNDRiMzAtNDJlZC0xMWVjLWJiM2UtNzk2YjYyN2RkMGMxIiwidiI6IjRmNjYyY2NmLTFmOTMtMTFlYy04MjE3LTUwNzA1NDU0NDM1YSIsInQiOjE2MzY2MzUyNzUxMzQsImgiOiI0MGUyOWMwOGM1ZjdmOGVkYTY1NzA1MTM1NmE1NmM0MDI2YjVkOGI4Y2UyNGEyOTQxMGVlZTFhNGJmYWY2OGI0In0=; _ksr_session=RHFHeHVxejNnWnNTYm96YndVajA5d1RyM1RkTHV0ZmZPQU1PajhhVzJqRzRiSzgzZEN0YWhva0Y3SUtRUWg3NlNqcEswbzlpeFYrN3ljaElSWEcyUUEvWEVoQm5HVy9ZVlRXOGVBWi95OGdBWm5Zb1ZqaWkweDU3cWVLaXl1Sm1LSzFvaDNycm12R0svcmV0bk9FS0ZnPT0tLTRIRXJlN2xqUFQ5ZzlLZU5YVzVYY2c9PQ%3D%3D--76fb9f1364afd84fd05ef22d68a732d9b0786135; request_time=Thu%2C+11+Nov+2021+12%3A50%3A03+-0000',
        'content-type':'application/json'
        }
    payload=[
        {"operationName":None,
         "variables":{"commentableId":"UHJvamVjdC0xNTA2NzExNDgy","nextCursor":None,"previousCursor":None,"replyCursor":None,"first":25,"last":None},
          "query":"query ($commentableId: ID!, $nextCursor: String, $previousCursor: String, $replyCursor: String, $first: Int, $last: Int) {\n  commentable: node(id: $commentableId) {\n    id\n    ... on Project {\n      url\n      __typename\n    }\n    ... on Commentable {\n      canComment\n      commentsCount\n      projectRelayId\n      canUserRequestUpdate\n      comments(first: $first, last: $last, after: $nextCursor, before: $previousCursor) {\n        edges {\n          node {\n            ...CommentInfo\n            ...CommentReplies\n            __typename\n          }\n          __typename\n        }\n        pageInfo {\n          startCursor\n          hasNextPage\n          hasPreviousPage\n          endCursor\n          __typename\n        }\n        __typename\n      }\n      __typename\n    }\n    __typename\n  }\n  me {\n    id\n    name\n    imageUrl(width: 200)\n    isKsrAdmin\n    url\n    __typename\n  }\n}\n\nfragment CommentInfo on Comment {\n  id\n  body\n  createdAt\n  parentId\n  author {\n    id\n    imageUrl(width: 200)\n    name\n    url\n    __typename\n  }\n  authorBadges\n  canReport\n  canDelete\n  hasFlaggings\n  deletedAuthor\n  deleted\n  authorCanceledPledge\n  __typename\n}\n\nfragment CommentReplies on Comment {\n  replies(last: 3, before: $replyCursor) {\n    totalCount\n    nodes {\n      ...CommentInfo\n      __typename\n    }\n    pageInfo {\n      startCursor\n      hasPreviousPage\n      __typename\n    }\n    __typename\n  }\n  __typename\n}\n"}]
    
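    # Send the GraphQL query as a JSON body together with the browser's request headers
    resp = requests.post(url, headers=headers, data=json.dumps(payload))
    # Stop here on an error status such as 403 instead of saving an error page
    resp.raise_for_status()

    # Save the raw response text for inspection
    with open('result.txt', 'w', encoding='utf-8') as f:
        f.write(resp.text)

    print('File written successfully')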
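Once the request succeeds, the comments still have to be pulled out of the JSON. The sketch below follows the field names in the query above (`commentable`, `comments`, `edges`, `node`, `pageInfo`). The helper name `extract_comments` is hypothetical, and the assumption that a list payload yields a list response should be checked against the real `result.txt`.

```python
from typing import Any, Dict, List, Optional, Tuple


def extract_comments(response_json: Any) -> Tuple[List[Dict[str, Any]], Optional[str]]:
    """Return (comment nodes, cursor for the next page or None)."""
    # Assumption: the payload was sent as a list, so the response is a list too.
    result = response_json[0] if isinstance(response_json, list) else response_json
    comments = result["data"]["commentable"]["comments"]
    nodes = [edge["node"] for edge in comments["edges"]]
    page = comments["pageInfo"]
    next_cursor = page["endCursor"] if page["hasNextPage"] else None
    return nodes, next_cursor


# Made-up sample shaped like the query's selection set, not real site data
sample = [{"data": {"commentable": {"comments": {
    "edges": [{"node": {"id": "c1", "body": "Great game"}}],
    "pageInfo": {"hasNextPage": True, "endCursor": "abc"},
}}}}]

nodes, cursor = extract_comments(sample)
for node in nodes:
    print(node["body"])
```

To fetch the next page, the returned cursor would go into `variables["nextCursor"]` of the next request, since the query passes it as `after: $nextCursor`. The loop ends when the cursor comes back as `None`.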
    

    If this helped, please click [Accept this answer]. Thanks!

    This answer was selected by the asker as the best answer.

Question events

  • Closed by the system on November 19
  • Answer accepted on November 11
  • Question edited on November 11
  • Question created on November 11