I have already pulled the Splash image with Docker,
then ran docker run -p 8050:8050 scrapinghub/splash and saw the startup success message.
Here is the relevant configuration in the project's settings.py:
# Splash server address
SPLASH_URL = 'http://localhost:8050'

# Enable the two scrapy-splash downloader middlewares and move
# HttpCompressionMiddleware after them
DOWNLOADER_MIDDLEWARES = {
    'scrapy_splash.SplashCookiesMiddleware': 723,
    'scrapy_splash.SplashMiddleware': 725,
    'scrapy.downloadermiddlewares.httpcompression.HttpCompressionMiddleware': 810,
}

# Deduplication filter that is aware of Splash request arguments
DUPEFILTER_CLASS = 'scrapy_splash.SplashAwareDupeFilter'

# Spider middleware to support cache_args (optional)
SPIDER_MIDDLEWARES = {
    'scrapy_splash.SplashDeduplicateArgsMiddleware': 100,
}
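For completeness, the scrapy-splash README also recommends one more settings.py entry, but only if the project has Scrapy's HTTP cache enabled; it is optional otherwise:

```python
# From the scrapy-splash README: a cache storage backend that is aware of
# Splash arguments. Only needed when HTTPCACHE_ENABLED is set in the project.
HTTPCACHE_STORAGE = 'scrapy_splash.SplashAwareFSCacheStorage'
```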
After that I tried debugging in scrapy shell:
I still can't scrape any data, and I can't figure out where I went wrong. I would sincerely appreciate any guidance. Thank you very much!
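One thing worth checking in the scrapy shell step: a plain fetch(url) in the shell issues an ordinary Request, so it does not go through Splash even with the middlewares configured. The scrapy-splash README suggests fetching Splash's render.html endpoint directly instead. A small sketch of building that URL (the splash_render_url helper and the example target URL are illustrative, not from the original post):

```python
# Build a URL for Splash's render.html HTTP endpoint, so the rendered page
# can be fetched directly in scrapy shell. Assumes Splash is listening on
# localhost:8050 as in the settings above.
from urllib.parse import urlencode

SPLASH_URL = 'http://localhost:8050'

def splash_render_url(url, wait=0.5):
    """Return the render.html URL that asks Splash to render `url`."""
    qs = urlencode({'url': url, 'wait': wait})
    return f'{SPLASH_URL}/render.html?{qs}'

print(splash_render_url('http://example.com'))
```

Then, in a terminal, running scrapy shell with the printed URL as its argument should return the JavaScript-rendered HTML if Splash itself is working.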