稳稳C9 2020-12-15 11:12 Acceptance rate: 33.3%
Views: 100
Accepted

What is the difference between sending JSON in a Scrapy request and in requests? [Bounty]

 

Of the two screenshots above, the second is the requests version, which works and returns data; the first is the Scrapy version.

Pages I have consulted so far (after a lot of searching on Baidu):

http://www.cocoachina.com/articles/69939

https://www.cnblogs.com/qiaoer1993/p/10802735.html

https://www.v2ex.com/t/533939

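The suggestions boil down to two approaches: first, declaring a JSON content type in the request headers; second, specifying `body` and `method` explicitly. I have tried both, and strangely neither works.

What exactly is the difference? How do I fix it?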

 

The Scrapy code is as follows:

import scrapy
import json
import requests


class BxwSpiderSpider(scrapy.Spider):
    name = 'bxw_spider'

    api_headers = {
        'Host': 'mpapi.baixing.com',
        'Connection': 'keep-alive',
        'Content-Length': '24',
        'BAIXING-SESSION': '$2y$10$iYbdcOD0tqZQWK1ITZc6PuIMfVDUsxItUQwepiF1VyC00ti24fPcG',
        'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/53.0.2785.143 Safari/537.36 MicroMessenger/7.0.9.501 NetType/WIFI MiniProgramEnv/Windows WindowsWechat',
        'content-type': 'application/json',
        'env_version': '7.0.9',
        'model': 'microsoft',
        'network_type': 'wifi',
        'os': 'Windows',
        'os_version': '10',
        'source': '70',
        'source_params': '',
        'source_path': '',
        'template_version': 'Ver1.3.6',
        'track_id': '1607997570581-6558221-0b293e26ae2048-15816961',
        'udid': 'a382863f-92eb-45aa-a2b0-bca844ca6dd9',
        'Referer': 'https://servicewechat.com/wxd9808e2433a403ab/42/page-frame.html',
        'Accept-Encoding': 'gzip, deflate, br',
    }

    url = 'https://mpapi.baixing.com/v1.3.6/'  # API endpoint

    def start_requests(self):
        index_json = '{"listing.getAds": {"areaId": "m28", "categoryId": "gongzuo", "page": 1}}'  # 2 3

        # yield scrapy.Request(
        #     url=self.url,
        #     headers=self.api_headers,
        #     method='POST',
        #     body=index_json,
        #     callback=self.parse,
        #     dont_filter=True)

        yield scrapy.FormRequest(
            url=self.url,
            headers=self.api_headers,
            formdata=eval(index_json),
            callback=self.parse,
            dont_filter=True)

    def parse(self, response):
        print('entered parse')
        res_json = json.dumps(response.text)
        print(res_json)
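
For reference, the `formdata` path in the spider above sends a form-encoded body rather than JSON. The sketch below uses only the standard library to contrast the two encodings of the same payload. `urllib.parse.urlencode` stands in for form encoding here; the exact bytes that Scrapy's `FormRequest` produces are an assumption and are not confirmed anywhere in this thread.

```python
import json
from urllib.parse import urlencode

# Payload taken from the spider above.
payload = {"listing.getAds": {"areaId": "m28", "categoryId": "gongzuo", "page": 1}}

# Form encoding works on flat key/value pairs, so the nested dict is not
# preserved: with doseq=True the inner dict is iterated and only its keys
# end up in the body.
form_body = urlencode(payload, doseq=True)

# JSON serialization keeps the nested structure intact.
json_body = json.dumps(payload)

print("form:", form_body)
print("json:", json_body)
```

If the server expects the JSON form, a form-encoded body would not parse on its side even when the `content-type` header claims `application/json`. Using `json.loads(index_json)` instead of `eval(index_json)` would also be the safer way to turn the string into a dict.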

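I previously scraped another site that also takes its parameters in the POST body. There, too, the requests version worked and the Scrapy version did not. Am I using Scrapy wrong?

I am really puzzled, and I have already searched Baidu extensively!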
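
One difference that may matter: the spider hard-codes `'Content-Length': '24'`, while requests, as far as I understand it, recomputes that header from the body it actually sends. The sketch below is only a quick check, under the assumption that the body is the JSON payload from the question, of whether the declared length matches the real one.

```python
import json

# Payload and declared header value taken from the spider in the question.
payload = {"listing.getAds": {"areaId": "m28", "categoryId": "gongzuo", "page": 1}}
declared_length = int("24")

# Content-Length has to equal the number of bytes actually sent in the body.
body = json.dumps(payload).encode("utf-8")
actual_length = len(body)

print("declared:", declared_length)
print("actual:", actual_length)
print("match:", declared_length == actual_length)
```

The two values differ, so a server or proxy that trusts the header could truncate or reject the body. This is a plausible cause rather than a confirmed one.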

11 answers

  • 放风喽 2020-12-16 13:32

    Here you go, mate; this took me half an hour of work.

    import json

    import scrapy


    class CeshiSpider(scrapy.Spider):
        name = 'ceshi'
        api_headers = {
            'Host': 'mpapi.baixing.com',
            'BAIXING-SESSION': '$2y$10$iYbdcOD0tqZQWK1ITZc6PuIMfVDUsxItUQwepiF1VyC00ti24fPcG',
            'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/53.0.2785.143 Safari/537.36 MicroMessenger/7.0.9.501 NetType/WIFI MiniProgramEnv/Windows WindowsWechat',
            # These two headers are not sent. Note that a hard-coded length
            # of 24 would not match the JSON body built below.
            # 'content-type': 'application/json',
            # 'Content-Length': '24',
            'Referer': 'https://servicewechat.com/wxd9808e2433a403ab/42/page-frame.html',
        }

        url = 'https://mpapi.baixing.com/v1.3.6/'  # API endpoint

        def start_requests(self):
            # A real dict, not a string, so json.dumps produces the JSON body.
            index_json = {"listing.getAds": {"areaId": "m28", "categoryId": "gongzuo", "page": 1}}  # 2 3

            # Plain Request with an explicit POST method and a JSON string body.
            yield scrapy.Request(
                url=self.url,
                method="POST",
                headers=self.api_headers,
                body=json.dumps(index_json),
                callback=self.parse,
                dont_filter=True)

        def parse(self, response):
            print("Result below")
            print(response.text)
    This answer was selected by the asker as the best answer.
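
The pattern in the accepted answer (a plain POST with a JSON string as the body and no hand-written `Content-Length`) can be wrapped in a small helper. This is a minimal sketch: `build_json_post` is a hypothetical name, not part of Scrapy, and setting `Content-Type` to `application/json` is my assumption about what a JSON API usually expects; the accepted answer leaves that header out entirely.

```python
import json


def build_json_post(url, payload, headers=None):
    """Return keyword arguments for a POST request with a JSON body.

    Hypothetical helper, not part of Scrapy or of the accepted answer.
    """
    # Drop any hand-written Content-Length / Content-Type (case-insensitive)
    # so they cannot disagree with the body built below.
    cleaned = {
        name: value
        for name, value in (headers or {}).items()
        if name.lower() not in ("content-length", "content-type")
    }
    # Assumption: the API accepts an explicit JSON content type.
    cleaned["Content-Type"] = "application/json"
    return {
        "url": url,
        "method": "POST",
        "headers": cleaned,
        "body": json.dumps(payload),
    }


kwargs = build_json_post(
    "https://mpapi.baixing.com/v1.3.6/",
    {"listing.getAds": {"areaId": "m28", "categoryId": "gongzuo", "page": 1}},
    headers={"Content-Length": "24", "content-type": "application/json"},
)
print(kwargs["headers"])
print(kwargs["body"])
```

Inside `start_requests` the result could then be passed on as `yield scrapy.Request(callback=self.parse, dont_filter=True, **kwargs)`. Newer Scrapy releases also provide `scrapy.http.JsonRequest`, which may be worth checking in the documentation for the installed version.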