清夏灬夏晚秋 2020-11-17 20:16

How do I fix a 404 Not Found error?

from bs4 import BeautifulSoup
import xlwt
import urllib.request
import urllib.error
import re


def main():
    baseurl = "http://www.xbiquge.la/1/1370/"
    # page to scrape
    datalist = getData(baseurl)
    save = ".//电影信息.xls"
    # saveData(datalist)

    # askURl("http://58921.com/film/new")


def getData(baseurl):
    datalist = []
    for i in range(0, 1):
        url = baseurl + str(i)
        html = askURl(url)  # store the fetched page source
    # soup = BeautifulSoup(html, "html.parser")
    return datalist


def askURl(url):
    head = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
                          "(KHTML, like Gecko) Chrome/86.0.4240.193 Safari/537.36",
            "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image"
                      "/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9",
            "Accept-Language": "zh-CN,zh;q=0.9",
            "Host": "www.xbiquge.la"}
    request = urllib.request.Request(url, headers=head)
    html = ""
    try:
        response = urllib.request.urlopen(request)
        html = response.read().decode("utf-8")
    except urllib.error.URLError as e:
        if hasattr(e, "code"):
            print(e.code)
        if hasattr(e, "reason"):
            print(e.reason)

    return html


def saveData(save):
    print("第一条")


if __name__ == "__main__":
    main()

That is the code; the site I am scraping is shown below.

1 answer

  • 放风喽 2020-11-18 04:58
    def getData(baseurl):
        datalist = []
        for i in range(0, 1):
            url = baseurl + str(i)
            html = askURl(url)  # store the fetched page source
        # soup = BeautifulSoup(html, "html.parser")
        return datalist

    This loop builds the URL http://www.xbiquge.la/1/1370/0

    That URL really does not exist, which is why the server returns 404.
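One possible way around this, sketched under some assumptions: request `baseurl` itself once (it is the book's index page), then read the real chapter addresses from the links on that page instead of appending a counter. The names `ChapterLinkParser` and `chapter_urls` are hypothetical, the sample HTML is made up for the demonstration, and the guess that chapter links end in `.html` has not been checked against the actual site.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ChapterLinkParser(HTMLParser):
    """Collect the href of every <a> tag whose target ends in .html."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        # Assumption: chapter pages are linked as "<something>.html".
        if href and href.endswith(".html"):
            self.hrefs.append(href)


def chapter_urls(index_html, baseurl):
    """Return absolute chapter URLs found in the index page HTML."""
    parser = ChapterLinkParser()
    parser.feed(index_html)
    # urljoin resolves both relative ("101.html") and root-relative
    # ("/1/1370/100.html") links against the index URL.
    return [urljoin(baseurl, href) for href in parser.hrefs]


# Hypothetical index-page fragment, used only to demonstrate the function.
sample = (
    '<dd><a href="/1/1370/100.html">Chapter 1</a></dd>'
    '<dd><a href="101.html">Chapter 2</a></dd>'
)
urls = chapter_urls(sample, "http://www.xbiquge.la/1/1370/")
print(urls)
```

In the question's code this would mean calling `askURl(baseurl)` once, passing the returned HTML to `chapter_urls`, and then looping over the resulting list with `askURl` instead of over `range(0, 1)`.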
