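When I loop over post_nodes and use XPath to read data from each child node post_node, I always get the data of the first node. Why does this happen, and how should I change the code so that it works correctly?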
import scrapy


class XpathSpider(scrapy.Spider):
    name = 'xpath'
    allowed_domains = ['news.cnblogs.com']
    start_urls = ['http://news.cnblogs.com/']

    def parse(self, response):
        post_nodes = response.xpath('//div[@id="news_list"]/div[@class="news_block"]')
        for post_node in post_nodes:
            image_url = post_node.xpath('//div[@class="entry_summary"]/a/img/@src').extract_first("")
            post_url = post_node.xpath('//h2/a/@href').extract_first("")
            print(image_url)
            print(post_url)
Output printed by XpathSpider: