How to use Selenium to scrape the file links on the current page, then turn the page and scrape the next page's file links as well


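zxlcaiyuangungun: Good afternoon 🌅🌅🌅
This answer draws on ChatGPT-3.5.
To scrape the file links on the current page with Selenium and then page forward to scrape the next page's links too, you can follow these steps: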
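# Step 1: import the libraries, start the browser, and open the listing page
import time
from lxml import etree
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
url = "https://www.dg.gov.cn/gkmlpt/index#694"
driver = webdriver.Chrome()
driver.get(url)
# Step 2: give the page time to finish loading
time.sleep(5)
# Step 3: parse the rendered HTML and extract the file links on this page
page_source = driver.page_source
html = etree.HTML(page_source)
links = html.xpath('//td[@class="first-td"]//@href')
print(links)
# Step 4: click the next-page button ('xpath_of_next_button' is a placeholder).
# Selenium 4.3+ no longer has find_element_by_xpath; use find_element(By.XPATH, ...).
next_button = driver.find_element(By.XPATH, 'xpath_of_next_button')
next_button.click()
# Optional: scroll down with the keyboard. PAGE_DOWN only scrolls the window;
# it does not load the next page of results by itself.
body = driver.find_element(By.TAG_NAME, 'body')
body.send_keys(Keys.PAGE_DOWN)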
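The fixed `time.sleep(5)` in the steps above either wastes time on fast pages or is too short on slow ones. A minimal sketch of a polling alternative follows; `wait_until` is a hypothetical helper written for this illustration, not part of Selenium (Selenium ships its own `WebDriverWait` in `selenium.webdriver.support.ui` for the same purpose).

```python
import time
from typing import Callable


def wait_until(condition: Callable[[], bool],
               timeout: float = 10.0,
               interval: float = 0.5) -> bool:
    """Poll `condition` until it returns True or `timeout` seconds elapse.

    Returns True if the condition was met in time, False otherwise.
    `wait_until` is a hypothetical helper, not a Selenium API.
    """
    deadline = time.monotonic() + timeout
    while True:
        if condition():
            return True
        if time.monotonic() >= deadline:
            return False
        # Sleep briefly between checks instead of one long fixed sleep.
        time.sleep(interval)
```

With the driver from the steps above, a call might look like `wait_until(lambda: 'first-td' in driver.page_source)`, assuming the link table is what you are waiting for.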
The complete code looks like this:
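import time
from lxml import etree
from selenium import webdriver
from selenium.common.exceptions import NoSuchElementException
from selenium.webdriver.common.by import By

url = "https://www.dg.gov.cn/gkmlpt/index#694"
driver = webdriver.Chrome()
driver.get(url)
time.sleep(5)  # wait for the first page to load
while True:
    # Parse the current page and print its file links
    page_source = driver.page_source
    html = etree.HTML(page_source)
    links = html.xpath('//td[@class="first-td"]//@href')
    print(links)
    try:
        # 'xpath_of_next_button' is a placeholder; replace it with the real XPath
        next_button = driver.find_element(By.XPATH, 'xpath_of_next_button')
        next_button.click()
    except NoSuchElementException:
        break  # no next-page button found, so treat this as the last page
    time.sleep(5)  # wait for the next page to load
driver.quit()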
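Note that you need to replace the XPath expression in the code with one that actually locates the next-page button on your target site.
Hope this helps!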
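A `while True` loop over pages needs an explicit stopping rule, and printing each page separately does not give you one combined list. A minimal sketch of a loop that handles both, kept independent of the browser so it can be tested on its own: `crawl_all_pages`, `get_links`, `go_next`, and `max_pages` are hypothetical names introduced here, and the stopping rules (no next page, page unchanged after clicking, page cap) are assumptions about what a typical paginated site needs.

```python
from typing import Callable, List, Optional


def crawl_all_pages(get_links: Callable[[], List[str]],
                    go_next: Callable[[], bool],
                    max_pages: int = 50) -> List[str]:
    """Collect links from consecutive pages into one de-duplicated list.

    get_links: returns the links found on the page currently shown.
    go_next:   tries to move to the next page; returns False if it cannot.
    max_pages: safety cap so the loop always terminates.
    """
    all_links: List[str] = []
    seen = set()
    previous_page: Optional[List[str]] = None
    for _ in range(max_pages):
        page_links = get_links()
        # Same links as last time: the click did not change the page,
        # which usually means we are already on the last page.
        if previous_page is not None and page_links == previous_page:
            break
        for link in page_links:
            if link not in seen:
                seen.add(link)
                all_links.append(link)
        previous_page = page_links
        if not go_next():
            break
    return all_links
```

With Selenium, `get_links` could wrap `etree.HTML(driver.page_source).xpath(...)`, and `go_next` could wrap the button click plus the wait, returning False when the button cannot be found.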