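Problem symptoms and background
The IDE is PyCharm, the environment is Anaconda3, and the interpreter is Python 3.8.
The code is meant to fetch the account information of users who liked posts in a friend's Qzone.
The error is raised by the print call on the last line.
Relevant code (please do not paste screenshots)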
#preparing
import requests as re
from lxml import etree
headers = {'user-agent':'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/101.0.4951.41 Safari/537.36'}
url = "https://user.qzone.qq.com/"
#get the QQ number
# the original fp.close had no parentheses, so the file was never closed
with open("source/qqnumber.txt") as fp:
    qn = fp.readline(11).strip()
#get the QQ cookie
with open("source/qqcookie.txt") as fp:
    # strip the trailing newline so it is not sent as part of the header value
    headers['cookie'] = fp.readline().strip()
#update the string
url = url + qn
print(url)
#start
res = re.get(url='https://user.qzone.qq.com/proxy/domain/ic2.qzone.qq.com/cgi-bin/feeds/feeds_html_module?g_iframeUser=1&i_uin=2470851837&i_login_uin=1498164408&mode=4&previewV8=1&style=35&version=8&needDelOpr=true&transparence=true&hideExtend=false&showcount=5&MORE_FEEDS_CGI=http%3A%2F%2Fic2.s6.qzone.qq.com%2Fcgi-bin%2Ffeeds%2Ffeeds_html_act_all&refer=2&paramstring=os-win7|100',headers=headers)
# encode('utf-8').decode('utf-8') is a no-op; set the response encoding instead (assumes the body is UTF-8)
res.encoding = 'utf-8'
result = res.text
with open('lloutput/output.txt', 'w', encoding='utf-8') as fp:
    fp.write(result)
#print(res.text)
html = etree.HTML(result)
print(html.xpath('//*[@class="user-list"]/text()'))
Output and error message
The problem occurs on the last line. The error is "UnicodeEncodeError: 'gbk' codec can't encode character '\xe3' in position 2: illegal multibyte sequence".
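One plausible explanation for this error, sketched below with the standard library only: the response bytes are UTF-8, but they were decoded as ISO-8859-1 (the fallback requests uses for text responses that declare no charset), so each byte becomes a Latin-1 character such as '\xe3', which the GBK console codec cannot encode. The sample string is hypothetical and only stands in for the real response body; that the server sends UTF-8 is an assumption.

```python
# Hypothetical sample text standing in for the real response body.
sample = "你好。"

# Bytes as the server would send them (assumed to be UTF-8).
raw = sample.encode("utf-8")

# What a wrong charset guess (ISO-8859-1) turns those bytes into.
wrong = raw.decode("iso-8859-1")


def can_encode(text, codec):
    """Return True if text can be encoded with codec, as print() must do for the console."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


print(can_encode(sample, "gbk"))  # correctly decoded Chinese text
print(can_encode(wrong, "gbk"))   # mis-decoded text containing '\xe3'
```

If this is what is happening, the print call is only where the problem surfaces; the string was already wrong when it was decoded.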
My approach and what I have tried
I then changed the encoding in PyCharm to UTF-8. The error went away, but the printed output is garbled.
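Switching the console to UTF-8 only changes how print encodes the string; it does not change the string itself, so text that was decoded with the wrong codec still prints as garbage. A minimal sketch of undoing such a wrong decode, assuming the text was UTF-8 bytes read as ISO-8859-1 (the helper name `repair_mojibake` and the sample string are hypothetical):

```python
def repair_mojibake(text, wrong_codec="iso-8859-1", real_codec="utf-8"):
    """Reverse a wrong decode, then decode the bytes with the real codec."""
    return text.encode(wrong_codec).decode(real_codec)


# Hypothetical sample: UTF-8 bytes mistakenly decoded as ISO-8859-1.
original = "你好。"
garbled = original.encode("utf-8").decode("iso-8859-1")

fixed = repair_mojibake(garbled)
print(fixed == original)
```

With requests, setting `res.encoding = 'utf-8'` before reading `res.text` avoids the wrong decode in the first place, again assuming the body really is UTF-8.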
Desired result
Turn the garbled output into readable text so that I can continue with the next steps.