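Hoping someone experienced can help. I've spent an hour on this and still can't figure it out. I'm scraping the article titles and URLs from a site's home page. The code looks right to me and the XPath isn't wrong either; uuu.txt does get created, but it's empty (┯_┯)
The site URL is in the code.
Code below: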
import requests
from lxml import etree

headers = {
    'User-Agent': "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.110 Safari/537.36 Edg/96.0.1054.57"
}
url = 'http://xb.xsqk.ccut.edu.cn/gwqk/list.htm'

# Download the page and parse it into an element tree
response = requests.get(url=url, headers=headers)
page_text = response.text
tree = etree.HTML(page_text)

# Select the table rows that should hold the article links
tr_list = tree.xpath('/html/body/div/div[2]/table[3]/tbody/tr/td/table/tbody/tr/td[3]/table[4]/tbody/tr/td/div/div/div[1]/table/tbody/tr')

# Write one "title + url" line per row; the with block closes the file itself
with open('./uuu.txt', 'w', encoding='utf-8') as fp:
    for tr in tr_list:
        title = tr.xpath('.//a/@title')[0]
        href = 'http://xb.xsqk.ccut.edu.cn/gwqk/list.htm' + tr.xpath('.//a/@href')[0]
        fp.write(title + href + '\n')
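
A possible cause, offered as a guess since I haven't checked what this particular site returns: XPaths copied from browser dev tools usually contain tbody, because the browser inserts that element when it builds the DOM. The raw HTML that requests downloads often has no tbody, and lxml does not add one, so the XPath matches nothing, tr_list is empty, the loop never runs, and the file stays empty. Printing len(tr_list) would confirm or rule this out. The sketch below reproduces the effect on a made-up HTML sample (SAMPLE_HTML and extract_links are hypothetical names, not part of the original code) and also joins relative links with urljoin instead of appending them to the list.htm address.

```python
from urllib.parse import urljoin

from lxml import etree

# Hypothetical sample standing in for the raw HTML a server might send.
# Note there is no <tbody> in the markup.
SAMPLE_HTML = """
<html><body>
<table>
  <tr><td><a href="/gwqk/a1.htm" title="First article">First article</a></td></tr>
  <tr><td><a href="a2.htm" title="Second article">Second article</a></td></tr>
  <tr><td>row without a link</td></tr>
</table>
</body></html>
"""

BASE_URL = 'http://xb.xsqk.ccut.edu.cn/gwqk/list.htm'


def extract_links(html, base_url):
    """Return (title, absolute_url) pairs for links found inside table rows."""
    tree = etree.HTML(html)
    results = []
    # No tbody in the path, and only anchors that carry both attributes,
    # so rows without a link are skipped instead of raising IndexError.
    for a in tree.xpath('//table//tr//a[@title and @href]'):
        results.append((a.get('title'), urljoin(base_url, a.get('href'))))
    return results


tree = etree.HTML(SAMPLE_HTML)
with_tbody = tree.xpath('//table/tbody/tr')   # browser-style path
without_tbody = tree.xpath('//table//tr')     # path that ignores tbody
links = extract_links(SAMPLE_HTML, BASE_URL)

print(len(with_tbody), len(without_tbody))
for title, href in links:
    print(title, href)
```

If this turns out to be the problem, removing every /tbody step from the long XPath (or switching to a shorter relative one) should be enough. If tr_list is still empty after that, it may be worth printing page_text to see whether the list is in the downloaded HTML at all, since it could be loaded by JavaScript.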