I'm scraping the Zhubajie site (zbj.com), and `text` always comes back as an empty list. Why is that?
import requests
from lxml import etree
import time
import random

url = "https://www.zbj.com/fw/?k=爬虫"
resp = requests.get(url)
resp_text = resp.text
# Use a separate name so the parsed tree does not shadow the etree module.
tree = etree.HTML(resp_text)
divs = tree.xpath('//*[@id="__layout"]/div/div[3]/div[1]/div[4]/div/div[2]/div/div[2]/div')
for div in divs:
    price = div.xpath('./div/div[3]/div[1]/span/text()')[0].strip("¥")
    if not price:
        continue
    company = div.xpath('./div/div[5]/div[1]/div[1]/div/text()')[0]
    # This is the query that comes back as [].
    text = div.xpath('../div/div[3]/div[2]/div/span//text()')
    print(company, price, text)
    time.sleep(random.uniform(1, 3))
Output:
八戒软件开发服务 880 []
畅序丨包售后按需定制满意付款 100 []
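
One thing that may be worth checking: the `text` query in the code above starts with `../`, while the `price` and `company` queries start with `./`. In XPath, `..` selects the parent of the current node, so that path is resolved one level higher than the other two. The sketch below shows the effect on a small, made-up HTML snippet. The markup and the names `SAMPLE`, `cards`, and `results` are assumptions for illustration only, not the real zbj.com page, so this is a possible cause to test rather than a confirmed diagnosis.

```python
from lxml import etree

# Hypothetical markup, NOT the real zbj.com page: two cards, each with a title span.
SAMPLE = """
<div id="list">
  <div class="card"><div><span>title A</span></div></div>
  <div class="card"><div><span>title B</span></div></div>
</div>
"""

tree = etree.HTML(SAMPLE)
cards = tree.xpath('//div[@id="list"]/div')

results = []
for card in cards:
    # './' starts at the card itself, so the span inside it is found.
    inside = card.xpath('./div/span/text()')
    # '../' starts at the card's parent, so the same steps are applied
    # one level too high and no longer line up with the span.
    from_parent = card.xpath('../div/span/text()')
    results.append((inside, from_parent))
    print(inside, from_parent)
```

If the same thing is happening on the real page, changing `../div/...` to `./div/...` in the `text` query would be the first thing to try. If that still returns `[]`, the path itself may not match the HTML that `requests` receives, which can differ from what the browser's developer tools show.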