A simple web page collector I built as a Python crawler returns no useful content, and the text comes back garbled
Python crawler source code:
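import requests

# Baidu search endpoint; requests appends the params dict as the query string
url = "https://www.baidu.com/s?"
# Send a desktop browser User-Agent instead of the default python-requests one
headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.110 Safari/537.36"
}
kw = input("Enter what you want to search for: ")
# Baidu reads the search keyword from the "wd" query parameter
password = {
    "wd": kw
}
resp = requests.get(url, params=password, headers=headers)
# Save the decoded response body as <keyword>.html
with open(kw + ".html", "w", encoding="utf-8") as fp:
    fp.write(resp.text)
print("ok")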
The content the crawler returns is:
<!DOCTYPE html>
<html lang="zh-CN">
<head>
<meta charset="utf-8">
<title>ç¾åº¦å®å¨éªè¯</title>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<meta name="apple-mobile-web-app-capable" content="yes">
<meta name="apple-mobile-web-app-status-bar-style" content="black">
<meta name="viewport" content="width=device-width, user-scalable=no, initial-scale=1.0, minimum-scale=1.0, maximum-scale=1.0">
<meta name="format-detection" content="telephone=no, email=no">
<link rel="shortcut icon" href="https://www.baidu.com/favicon.ico" type="image/x-icon">
<link rel="icon" sizes="any" mask href="https://www.baidu.com/img/baidu.svg">
<meta http-equiv="X-UA-Compatible" content="IE=Edge">
<meta http-equiv="Content-Security-Policy" content="upgrade-insecure-requests">
<link rel="stylesheet" href="https://ppui-static-wap.cdn.bcebos.com/static/touch/css/api/mkdjump_aac6df1.css" />
</head>
<body>
<div class="timeout hide-callback">
<div class="timeout-img"></div>
<div class="timeout-title">ç½ç»ä¸ç»åï¼è¯·ç¨åéè¯</div>
<button type="button" class="timeout-button">è¿åé¦é¡µ</button>
</div>
<div class="timeout-feedback hide-callback">
<div class="timeout-feedback-icon"></div>
<p class="timeout-feedback-title">é®é¢åé¦</p>
</div>
<script src="https://ppui-static-wap.cdn.bcebos.com/static/touch/js/mkdjump_v2_2d634b8.js"></script>
</body>
</html>
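A likely reading of the garbled text above: the page is UTF-8 (its `<meta charset>` says so), but each byte was decoded as one ISO-8859-1 character. `requests` falls back to ISO-8859-1 for `resp.text` when the server sends a `text/*` Content-Type header without a charset, which would produce this pattern. The sketch below reproduces the effect offline and reverses it. `garble` and `repair` are hypothetical helper names, and the sample string is an assumption about what the `<title>` originally said.

```python
def garble(text: str) -> str:
    """Encode as UTF-8, then wrongly decode each byte as ISO-8859-1."""
    return text.encode("utf-8").decode("iso-8859-1")


def repair(garbled: str) -> str:
    """Undo garble(): recover the original bytes and decode them as UTF-8."""
    return garbled.encode("iso-8859-1").decode("utf-8")


# Assumed original title, meaning "Baidu security verification"
original = "百度安全验证"
broken = garble(original)

print(repr(broken))    # mojibake, including invisible C1 control characters
print(repair(broken))  # the readable title again
```

If that assumption holds, the title means "Baidu security verification", which suggests the response is an anti-bot check page and not a page of search results. `repair` needs every original byte to be present, so it may fail on text that was copied and pasted after control characters were lost.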
How can I fix this? Thanks!
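One possible direction, sketched under assumptions: decode the response bytes as UTF-8 explicitly instead of relying on the encoding `requests` guesses, and pass the keyword under `wd`, the query parameter Baidu reads for the search term. `build_params`, `decode_body`, and `fetch` are hypothetical names used only for illustration. The page title, once decoded, appears to be Baidu's security verification page, so correct decoding makes the text readable but will not by itself get past that check.

```python
from typing import Dict

SEARCH_URL = "https://www.baidu.com/s"
HEADERS = {
    "User-Agent": (
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
        "(KHTML, like Gecko) Chrome/96.0.4664.110 Safari/537.36"
    )
}


def build_params(keyword: str) -> Dict[str, str]:
    """Query parameters for a Baidu search; the keyword goes under "wd"."""
    return {"wd": keyword}


def decode_body(raw: bytes, encoding: str = "utf-8") -> str:
    """Decode raw response bytes with an explicit encoding (UTF-8 assumed)."""
    return raw.decode(encoding, errors="replace")


def fetch(keyword: str) -> str:
    """Request the search page and return its body decoded as UTF-8."""
    import requests  # third-party; imported here so the helpers above run without it

    resp = requests.get(
        SEARCH_URL,
        params=build_params(keyword),
        headers=HEADERS,
        timeout=10,
    )
    resp.raise_for_status()
    # resp.content holds the undecoded bytes, so no encoding guess is involved
    return decode_body(resp.content)
```

In the script, `html = fetch(kw)` followed by `fp.write(html)` would take the place of the `requests.get` and `resp.text` lines. Setting `resp.encoding = "utf-8"` (or `resp.encoding = resp.apparent_encoding`) before reading `resp.text` is an equivalent route.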