yangyangk99 2023-05-24 21:45 · Acceptance rate: 0%
Views: 82
Closed

Scrapy crawler problem

My Scrapy crawler raised the errors below.
First error:
2023-05-24 15:44:49 [scrapy.core.scraper] ERROR: Error downloading <GET https://zh.wikipedia.org/wiki/%E8%A1%8C%E7%A8%8B%3E
Traceback (most recent call last):
File "/Users/luhongyang/opt/anaconda3/python.app/Contents/lib/python3.9/site-packages/tldextract/cache.py", line 190, in run_and_cache
result = self.get(namespace=namespace, key=key_args)
File "/Users/luhongyang/opt/anaconda3/python.app/Contents/lib/python3.9/site-packages/tldextract/cache.py", line 93, in get
raise KeyError("namespace: " + namespace + " key: " + repr(key))
KeyError: "namespace: publicsuffix.org-tlds key: {'urls': ('https://publicsuffix.org/list/public_suffix_list.dat', 'https://raw.githubusercontent.com/publicsuffix/list/master/public_suffix_list.dat'), 'fallback_to_snapshot': True}"

Second error:
During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/Users/luhongyang/opt/anaconda3/python.app/Contents/lib/python3.9/site-packages/tldextract/cache.py", line 190, in run_and_cache
result = self.get(namespace=namespace, key=key_args)
File "/Users/luhongyang/opt/anaconda3/python.app/Contents/lib/python3.9/site-packages/tldextract/cache.py", line 93, in get
raise KeyError("namespace: " + namespace + " key: " + repr(key))
KeyError: "namespace: urls key: {'url': 'https://publicsuffix.org/list/public_suffix_list.dat'}"

Third error:
During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/Users/luhongyang/opt/anaconda3/python.app/Contents/lib/python3.9/site-packages/twisted/internet/defer.py", line 1697, in _inlineCallbacks
result = context.run(gen.send, result)
File "/Users/luhongyang/opt/anaconda3/python.app/Contents/lib/python3.9/site-packages/scrapy/core/downloader/middleware.py", line 64, in process_response
method(request=request, response=response, spider=spider)
File "/Users/luhongyang/opt/anaconda3/python.app/Contents/lib/python3.9/site-packages/scrapy/downloadermiddlewares/cookies.py", line 73, in process_response
self._process_cookies(cookies, jar=jar, request=request)
File "/Users/luhongyang/opt/anaconda3/python.app/Contents/lib/python3.9/site-packages/scrapy/downloadermiddlewares/cookies.py", line 44, in _process_cookies
if cookie_domain and _is_public_domain(cookie_domain):
File "/Users/luhongyang/opt/anaconda3/python.app/Contents/lib/python3.9/site-packages/scrapy/downloadermiddlewares/cookies.py", line 19, in _is_public_domain
parts = _split_domain(domain)
File "/Users/luhongyang/opt/anaconda3/python.app/Contents/lib/python3.9/site-packages/tldextract/tldextract.py", line 233, in __call__
suffix_index = self._get_tld_extractor().suffix_index(
File "/Users/luhongyang/opt/anaconda3/python.app/Contents/lib/python3.9/site-packages/tldextract/tldextract.py", line 274, in _get_tld_extractor
public_tlds, private_tlds = get_suffix_lists(
File "/Users/luhongyang/opt/anaconda3/python.app/Contents/lib/python3.9/site-packages/tldextract/suffix_list.py", line 55, in get_suffix_lists
return cache.run_and_cache(
File "/Users/luhongyang/opt/anaconda3/python.app/Contents/lib/python3.9/site-packages/tldextract/cache.py", line 192, in run_and_cache
result = func(**kwargs)
File "/Users/luhongyang/opt/anaconda3/python.app/Contents/lib/python3.9/site-packages/tldextract/suffix_list.py", line 72, in _get_suffix_lists
text = find_first_response(cache, urls, cache_fetch_timeout=cache_fetch_timeout)
File "/Users/luhongyang/opt/anaconda3/python.app/Contents/lib/python3.9/site-packages/tldextract/suffix_list.py", line 30, in find_first_response
return cache.cached_fetch_url(
File "/Users/luhongyang/opt/anaconda3/python.app/Contents/lib/python3.9/site-packages/tldextract/cache.py", line 199, in cached_fetch_url
return self.run_and_cache(
File "/Users/luhongyang/opt/anaconda3/python.app/Contents/lib/python3.9/site-packages/tldextract/cache.py", line 192, in run_and_cache
result = func(**kwargs)
File "/Users/luhongyang/opt/anaconda3/python.app/Contents/lib/python3.9/site-packages/tldextract/cache.py", line 209, in _fetch_url
response = session.get(url, timeout=timeout)
File "/Users/luhongyang/opt/anaconda3/python.app/Contents/lib/python3.9/site-packages/requests/sessions.py", line 600, in get
return self.request("GET", url, **kwargs)
File "/Users/luhongyang/opt/anaconda3/python.app/Contents/lib/python3.9/site-packages/requests/sessions.py", line 587, in request
resp = self.send(prep, **send_kwargs)
File "/Users/luhongyang/opt/anaconda3/python.app/Contents/lib/python3.9/site-packages/requests/sessions.py", line 701, in send
r = adapter.send(request, **kwargs)
File "/Users/luhongyang/opt/anaconda3/python.app/Contents/lib/python3.9/site-packages/requests/adapters.py", line 455, in send
conn = self.get_connection(request.url, proxies)
File "/Users/luhongyang/opt/anaconda3/python.app/Contents/lib/python3.9/site-packages/requests/adapters.py", line 352, in get_connection
conn = proxy_manager.connection_from_url(url)
File "/Users/luhongyang/opt/anaconda3/python.app/Contents/lib/python3.9/site-packages/urllib3/poolmanager.py", line 299, in connection_from_url
return self.connection_from_host(
File "/Users/luhongyang/opt/anaconda3/python.app/Contents/lib/python3.9/site-packages/urllib3/poolmanager.py", line 500, in connection_from_host
return super(ProxyManager, self).connection_from_host(
File "/Users/luhongyang/opt/anaconda3/python.app/Contents/lib/python3.9/site-packages/urllib3/poolmanager.py", line 246, in connection_from_host
return self.connection_from_context(request_context)
File "/Users/luhongyang/opt/anaconda3/python.app/Contents/lib/python3.9/site-packages/urllib3/poolmanager.py", line 261, in connection_from_context
return self.connection_from_pool_key(pool_key, request_context=request_context)
File "/Users/luhongyang/opt/anaconda3/python.app/Contents/lib/python3.9/site-packages/urllib3/poolmanager.py", line 282, in connection_from_pool_key
pool = self._new_pool(scheme, host, port, request_context=request_context)
File "/Users/luhongyang/opt/anaconda3/python.app/Contents/lib/python3.9/site-packages/urllib3/poolmanager.py", line 214, in _new_pool
return pool_cls(host, port, **request_context)
File "/Users/luhongyang/opt/anaconda3/python.app/Contents/lib/python3.9/site-packages/urllib3/connectionpool.py", line 938, in __init__
HTTPConnectionPool.__init__(
File "/Users/luhongyang/opt/anaconda3/python.app/Contents/lib/python3.9/site-packages/urllib3/connectionpool.py", line 198, in __init__
self.pool = self.QueueCls(maxsize)
TypeError: LifoQueue() takes no arguments

These errors all look very strange — every frame in the traceback is inside library code. Is this a problem with the libraries, or with my code?
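One thing worth checking for the final error: the stdlib `queue.LifoQueue` does accept a `maxsize` argument, so `TypeError: LifoQueue() takes no arguments` suggests urllib3 is seeing some other `LifoQueue` — a common (though here unconfirmed) cause is a local file named `queue.py` shadowing the standard-library module, or a monkey-patched environment. A quick diagnostic sketch:

```python
import queue

# If `queue` is shadowed by a project file, __file__ will point into the
# project directory instead of the Python standard library.
print(queue.__file__)

# The real stdlib LifoQueue accepts a maxsize argument; a shadowed or
# patched replacement may raise the same TypeError seen in the traceback.
q = queue.LifoQueue(10)
q.put("ok")
print(q.get())
```

If `queue.LifoQueue(10)` raises the same `TypeError` here, the problem is in the environment, not in Scrapy or urllib3.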


9 replies

  • 成都渔民 2023-05-24 21:56
    Earned a ¥0.90 bounty for this question

    "Error downloading" means the download failed: the link at the end of that log line is unreachable. Try copying the link into a browser — it really cannot be accessed.
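    Note that the first two KeyErrors come from tldextract's cache, and the traceback shows they are triggered by Scrapy's cookies middleware, which calls tldextract and in turn tries to download the public suffix list from publicsuffix.org through the configured proxy. If the crawl does not actually need cookies, one possible workaround (an assumption — verify the spider still works without cookies) is to disable that middleware in the project's settings.py, so Scrapy never reaches the tldextract code path:

    ```python
    # settings.py
    # Assumption: this crawl does not depend on cookies. With cookies disabled,
    # Scrapy's CookiesMiddleware is not loaded, so the _is_public_domain /
    # tldextract call that tries to fetch the public suffix list never runs.
    COOKIES_ENABLED = False
    ```

    Alternatively, making sure the machine (or its proxy) can reach https://publicsuffix.org lets tldextract populate its cache and should make all three errors disappear together.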



Question timeline

  • Question closed by the system on June 1
  • Question created on May 24
