A simple question about an except error: this code keeps failing and cannot fetch the pages

from bs4 import BeautifulSoup     # HTML parsing, data extraction
import re       # regular expressions, text matching
import urllib.request,urllib.error      # build the URL, fetch page data
import xlwt     # Excel operations
import sqlite3  # SQLite database operations

def main():
    baserl = 'https://movie.douban.com/top250?start='
    url1 = getat(baserl)
fike = re.compile(r'<a href="(.*?)">')
def getat(baserl):
    for i in range(0,10):
        url = baserl+str(25*i)
        html = gat(url)
        soup = BeautifulSoup(html,'html.parser')

        for item in soup('div',class_='item'):
            item = str(item)
            save = []
            like = re.findall(fike,item)[0]
            print(like)
def gat(url):

    # global html
    head = {
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.101 Safari/537.36 Edg/91.0.864.48"
        ''}
    a =urllib.request.Request(url,headers=head)
    html = ''
    try:
        response = urllib.request.urlopen(a)
        html = response.read().docode('utf-8')
    except :
        print('14')
    return html


if __name__ == '__main__':
    main()
    print('爬完')

一只爱编程的书虫 2021-09-20 19:15

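Analysis:
Change the bare except to except Exception as e and print e, as in the code below, to see the actual error message.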

from bs4 import BeautifulSoup     # HTML parsing, data extraction
import re       # regular expressions, text matching
import urllib.request,urllib.error      # build the URL, fetch page data
import xlwt     # Excel operations
import sqlite3  # SQLite database operations
def main():
    baserl = 'https://movie.douban.com/top250?start='
    url1 = getat(baserl)
fike = re.compile(r'<a href="(.*?)">')
def getat(baserl):
    for i in range(0,10):
        url = baserl+str(25*i)
        html = gat(url)
        soup = BeautifulSoup(html,'html.parser')
        for item in soup('div',class_='item'):
            item = str(item)
            save = []
            like = re.findall(fike,item)[0]
            print(like)
def gat(url):
    # global html
    head = {
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.101 Safari/537.36 Edg/91.0.864.48"
        ''}
    a =urllib.request.Request(url,headers=head)
    html = ''
    try:
        response = urllib.request.urlopen(a)
        html = response.read().docode('utf-8')
    except Exception as e:
        print(e)
    return html
 
if __name__ == '__main__':
    main()
    print('爬完')

Output:

'bytes' object has no attribute 'docode'
'bytes' object has no attribute 'docode'
'bytes' object has no attribute 'docode'
'bytes' object has no attribute 'docode'
'bytes' object has no attribute 'docode'
'bytes' object has no attribute 'docode'
'bytes' object has no attribute 'docode'
'bytes' object has no attribute 'docode'
'bytes' object has no attribute 'docode'
'bytes' object has no attribute 'docode'
爬完
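
The message above names the missing attribute but not the line that failed. As a minimal sketch (the function name decode_page and the sample bytes are made up for illustration, and no network access is involved), the standard traceback module can print the exception type together with the file and line number:

```python
import traceback


def decode_page(raw: bytes) -> str:
    """Decode raw bytes, reporting the full traceback if anything fails."""
    html = ''
    try:
        # Deliberate typo mirroring the question: bytes has no `docode` method.
        html = raw.docode('utf-8')
    except Exception as e:
        # repr(e) includes the exception class, not only the message text.
        print(repr(e))
        # print_exc() writes the traceback (file, line number, source line) to stderr.
        traceback.print_exc()
    return html


# Hypothetical sample input standing in for response.read().
result = decode_page(b'<html></html>')
```

Because the exception is caught, the function still returns the empty string, which is why the original script kept running and printed the same message once per page.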

The method name is obviously misspelled: docode should be decode.
Corrected code:

from bs4 import BeautifulSoup     # HTML parsing, data extraction
import re       # regular expressions, text matching
import urllib.request,urllib.error      # build the URL, fetch page data
import xlwt     # Excel operations
import sqlite3  # SQLite database operations
def main():
    baserl = 'https://movie.douban.com/top250?start='
    url1 = getat(baserl)
fike = re.compile(r'<a href="(.*?)">')
def getat(baserl):
    for i in range(0,10):
        url = baserl+str(25*i)
        html = gat(url)
        soup = BeautifulSoup(html,'html.parser')
        for item in soup('div',class_='item'):
            item = str(item)
            save = []
            like = re.findall(fike,item)[0]
            print(like)
def gat(url):
    # global html
    head = {
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.101 Safari/537.36 Edg/91.0.864.48"
    }
    a =urllib.request.Request(url,headers=head)
    html = ''
    try:
        response = urllib.request.urlopen(a)
        html = response.read().decode('utf-8')
    except Exception as e:
        print(e)
    return html
 
if __name__ == '__main__':
    main()
    print('爬完')

I have tested it and it runs correctly.

The asker selected this answer as the best answer.
