My code:
# -*- coding:utf-8 -*-
from scrapy.spiders import CrawlSpider
from scrapy.selector import Selector


class TestSrc(CrawlSpider):
    name = "testSrcapy"
    start_urls = ['https://zhidao.baidu.com/question/1993068880203051627.html']

    def parse(self, response):
        selector = Selector(response)
        UrlData = selector.xpath('//html/body/div[7]/div/section/article/div[1]/h1/span/text()').extract()
        print(UrlData)
USER_AGENT and ROBOTSTXT_OBEY are already set in settings.py.
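For reference, those two settings would look roughly like the sketch below. The post does not show the actual values, so both values here are assumptions chosen only for illustration.

```python
# settings.py (sketch; both values are assumptions, not taken from the post)

# A browser-like User-Agent string; any realistic value could be used here.
USER_AGENT = "Mozilla/5.0 (X11; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/115.0"

# Whether Scrapy should honor the site's robots.txt before crawling.
ROBOTSTXT_OBEY = False
```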
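I just want to print the title text as a test. I have tried other sites too, and those do not work either (although the Douban example from the tutorial works every time).
The XPath was copied from Firefox.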
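One possible cause, not confirmed for this page, is that an absolute, position-based XPath copied from the browser describes the DOM after scripts have run, while Scrapy only sees the raw HTML the server returned. The sketch below uses the standard library's ElementTree and two made-up snippets of markup (the element ids and the title text are hypothetical) to show how a positional path can match in one and miss in the other, while a shorter relative path still matches.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified markup standing in for what the server returns.
RAW_HTML = """
<html><body>
  <div id="header">nav</div>
  <div id="content"><article><h1><span>Example title</span></h1></article></div>
</body></html>
"""

# The same page after a script inserted a banner at the top of <body>,
# which is what the browser's inspector would show.
BROWSER_DOM = """
<html><body>
  <div id="banner">ad</div>
  <div id="header">nav</div>
  <div id="content"><article><h1><span>Example title</span></h1></article></div>
</body></html>
"""

POSITIONAL = "./body/div[3]/article/h1/span"  # as it would be copied from BROWSER_DOM
RELATIVE = ".//h1/span"                       # does not depend on sibling positions


def first_text(markup, path):
    """Return the text of the first node matching path, or None if nothing matches."""
    root = ET.fromstring(markup)
    node = root.find(path)
    return node.text if node is not None else None


in_browser = first_text(BROWSER_DOM, POSITIONAL)  # matches: the third div exists
in_raw = first_text(RAW_HTML, POSITIONAL)         # no third div, so no match
relative_raw = first_text(RAW_HTML, RELATIVE)     # still finds the title

print(in_browser, in_raw, relative_raw)
```

In the spider itself, the analogous change would be a relative expression such as `response.xpath('//h1//text()').extract()`. Running `scrapy shell` with the URL and calling `view(response)` shows the page as Scrapy actually received it, which helps check whether the title is present in the raw HTML at all.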