weixin_44740202 2021-03-03 00:40 Acceptance rate: 40%

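Scrapy raises a "got NoneType" error after I add a proxy

Here is the error output first.

# The site name is hidden below. The requested URL itself is definitely not the problem.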

2021-03-03 00:31:22 [scrapy.core.scraper] ERROR: Error downloading <GET https://www.xxx.com/>
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/twisted/internet/defer.py", line 1416, in _inlineCallbacks
    result = result.throwExceptionIntoGenerator(g)
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/twisted/python/failure.py", line 512, in throwExceptionIntoGenerator
    return g.throw(self.type, self.value, self.tb)
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/scrapy/core/downloader/middleware.py", line 45, in process_request
    return (yield download_func(request=request, spider=spider))
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/scrapy/utils/defer.py", line 55, in mustbe_deferred
    result = f(*args, **kw)
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/scrapy/core/downloader/handlers/__init__.py", line 75, in download_request
    return handler.download_request(request, spider)
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/scrapy/core/downloader/handlers/http11.py", line 88, in download_request
    return agent.download_request(request)
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/scrapy/core/downloader/handlers/http11.py", line 342, in download_request
    agent = self._get_agent(request, timeout)
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/scrapy/core/downloader/handlers/http11.py", line 301, in _get_agent
    _, _, proxyHost, proxyPort, proxyParams = _parse(proxy)
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/scrapy/core/downloader/webclient.py", line 36, in _parse
    return _parsed_url_args(parsed)
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/scrapy/core/downloader/webclient.py", line 19, in _parsed_url_args
    host = to_bytes(parsed.hostname, encoding="ascii")
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/scrapy/utils/python.py", line 106, in to_bytes
    raise TypeError('to_bytes must receive a str or bytes '
TypeError: to_bytes must receive a str or bytes object, got NoneType
(base) licongjian@licongjiandeMacBook-Pro jingdongPro $ 

 

When I set the proxy by hand, I could fetch data. The problem appeared once I started taking the proxy from a Redis set:

def process_request(self, request, spider):
    proxy = str(self.redis_db.srandmember('proxy')).replace('b', '')
    request.meta['proxy'] = proxy

These are the printed request.meta and the proxy value:
{'download_timeout': 3.0, 'proxy': "'https://116.115.210.140:4326'"}
'https://116.115.210.140:4326' <class 'str'>
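
The literal quotes inside the printed proxy value are the clue. The following is a minimal reproduction using only the standard library, under the assumption that the Redis client returned the proxy as bytes (the default in redis-py). It does not call Scrapy; it only shows that a URL wrapped in quotes parses to a hostname of None, which is the value that to_bytes rejects in the traceback.

```python
from urllib.parse import urlparse

# Assumed raw value: bytes, as returned by a Redis client without decoding.
raw = b"https://116.115.210.140:4326"

# Same conversion as in process_request: str() of bytes gives the repr "b'...'",
# and replace('b', '') removes the prefix but leaves the surrounding quotes.
broken = str(raw).replace("b", "")
print(repr(broken))               # the quotes are part of the string
print(urlparse(broken).hostname)  # None: no scheme or host is recognised

# Decoding the bytes gives a clean URL with a real hostname.
fixed = raw.decode("utf-8")
print(urlparse(fixed).hostname)   # 116.115.210.140
```

Note that replace('b', '') also deletes every other letter b in the string, so it would corrupt any proxy URL that contains one.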




1 answer

  • Marst Code 2023-06-12 14:50

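    This is a problem with how the value is read. By default, Redis clients return values as bytes, so calling str() on the result produces the text "b'...'" rather than the URL itself.
    Fix: pass decode_responses=True when connecting to Redis. Values then come back as str, and the str(...).replace('b', '') conversion is no longer needed.
    For example:
    redis_db = redis.Redis(host=RedisConfig.redis_host, port=RedisConfig.redis_port, db=0, decode_responses=True)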
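
If changing the connection settings is not an option, the middleware can decode the value itself. Below is a minimal sketch, not the asker's actual middleware: the class name RedisProxyMiddleware, the helper normalize_proxy, and the key argument are hypothetical names chosen for illustration. The client is duck-typed, so anything with an srandmember(key) method works, such as a redis.Redis instance.

```python
from typing import Optional, Union


def normalize_proxy(value: Union[bytes, str, None]) -> Optional[str]:
    """Turn a value read from Redis into a proxy URL string, or None if unusable."""
    if value is None:
        # srandmember returns None when the set is empty or does not exist.
        return None
    if isinstance(value, bytes):
        # Default client behaviour when decode_responses=True is not set.
        value = value.decode("utf-8")
    value = value.strip()
    return value or None


class RedisProxyMiddleware:
    """Hypothetical downloader middleware that picks a random proxy from a Redis set."""

    def __init__(self, redis_db, key="proxy"):
        self.redis_db = redis_db  # any object exposing srandmember(key)
        self.key = key

    def process_request(self, request, spider):
        proxy = normalize_proxy(self.redis_db.srandmember(self.key))
        if proxy is not None:
            request.meta["proxy"] = proxy
        return None  # returning None lets Scrapy continue handling the request
```

Skipping the assignment when no proxy is available means the request goes out without a proxy instead of failing on a None value; whether that fallback is acceptable depends on the crawl.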
