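Symptoms and background
I recently came across Microsoft's Playwright and tried it out. Compared with Selenium, it spares me from writing the automation code by hand.
But I ran into a problem I can't solve. The official documentation is in English, and most of what I found covers screenshots and screen recording.
What I need is this: after the page has been opened automatically, I want the equivalent of requests.get(url, headers=headers).text, so that I can hand the result to bs4 afterwards.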
Relevant code (please do not paste screenshots)
from playwright.sync_api import Playwright, sync_playwright


def run(playwright: Playwright) -> None:
    browser = playwright.chromium.launch(headless=False)
    context = browser.new_context()
    page = context.new_page()
    page.goto("https://www.baidu.com/")
    page.click("input[name=\"wd\"]")
    page.fill("input[name=\"wd\"]", "爱企查")
    with page.expect_navigation():
        page.press("input[name=\"wd\"]", "Enter")
    with page.expect_navigation():
        with page.expect_popup() as popup_info:
            page.click("text=人人都用的企业信息查询平台-爱企查-免费查-专..")
        page1 = popup_info.value
    page1.click("img[alt=\"关闭\"]")
    print(page1.text)  # here I want to get the current page's content; this is the line that fails
    context.close()
    browser.close()

# with sync_playwright() as playwright:
#     browser = playwright.chromium.launch(headless=False)
#     context = browser.new_context()
#     # create a new page inside context.

if __name__ == '__main__':
    with sync_playwright() as playwright:
        run(playwright)
Output and error message
Traceback (most recent call last):
  File "D:/playwright/pwtest_1.py", line 42, in <module>
    run(playwright)
  File "D:/playwright/pwtest_1.py", line 32, in run
    print(page1.text)
AttributeError: 'Page' object has no attribute 'text'
My approach and what I have tried
I tried:
page1.value;
page1.context;
Neither gives me the text of the current page.
The result I want
I would like to get the text of the current page and then post-process it with BeautifulSoup. Thanks!
Note: the target page is used for testing only.
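
A minimal sketch of one possible way to get there, not a definitive answer: in Playwright's sync API a Page has no `text` attribute, but `page.content()` returns the full HTML of the rendered page, which is roughly what `requests.get(url).text` gives, and `page.inner_text("body")` returns only the visible text. The helper names `get_rendered_html`, `get_visible_text`, `html_to_soup` and `demo` are made up for this illustration, and the choice of the "domcontentloaded" load state and of the "html.parser" backend are assumptions that may need adjusting for the real target page.

```python
from typing import Any


def get_rendered_html(page: Any) -> str:
    """Return the page's HTML after the browser has rendered it.

    `page` is expected to be a Playwright sync Page (for example the
    popup page1 from the question).
    """
    # Assumption: waiting for DOMContentLoaded is enough for this page;
    # "load" or "networkidle" are the other states Playwright accepts.
    page.wait_for_load_state("domcontentloaded")
    return page.content()


def get_visible_text(page: Any) -> str:
    """Return only the text a user would see in the <body> element."""
    return page.inner_text("body")


def html_to_soup(html: str):
    """Parse HTML with BeautifulSoup for further processing."""
    # Imported here so the helpers above work without bs4 installed.
    from bs4 import BeautifulSoup

    return BeautifulSoup(html, "html.parser")


def demo() -> None:
    """Open a page, grab its HTML and print the <title> via BeautifulSoup."""
    # Imported here so the helpers above can be used without a browser.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as playwright:
        browser = playwright.chromium.launch(headless=True)
        page = browser.new_page()
        page.goto("https://www.baidu.com/")
        soup = html_to_soup(get_rendered_html(page))
        print(soup.title.string if soup.title else None)
        browser.close()
```

In the script from the question, this would mean replacing `print(page1.text)` with something like `html = get_rendered_html(page1)` and passing `html` to BeautifulSoup. Calling `demo()` runs the whole flow against the Baidu home page.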