IndexError: list index out of range

Line 39 (the re.findall(...)[0] line in get_id) raises IndexError: list index out of range

# url = 'https://wenku.baidu.com/view/dcfab8bff705cc175527096e.html'

import json
import requests
import re
from selenium import webdriver
import time


def open_url(url):
    print('Opening the page automatically')
    browser = webdriver.Chrome()

    browser.get(url)
    print('Waiting 2 seconds')
    time.sleep(2)

    # The statement below does not locate the "Continue reading" button itself but an element above it, because scrolling to the button's own position would leave it covered
    eles = browser.find_element('xpath', '//*[@id="html-reader-go-more"]/div[1]/div[3]/div[1]')
    browser.execute_script('arguments[0].scrollIntoView();', eles)
    print('Waiting 2 seconds')
    time.sleep(2)

    # Click the "Continue reading" button
    browser.find_element('xpath', '//*[@id="html-reader-go-more"]/div[2]/div[1]/span/span[2]').click()
    print('All document content is now displayed')


def fetch_url(url):
    '''Return the page source.'''
    headers = {
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.106 Safari/537.36'}
    session = requests.session()
    resp = session.get(url, headers=headers).content.decode('utf8')
    return resp


def get_id(resp):
    url_id = re.findall(r"docId.*?\:.*?\'(.*?)\'\,", resp)[0]
    return url_id


def get_type(resp):
    url_type = re.findall(r"docType.*?\:.*?\'(.*?)\'\,", resp)[0]
    return url_type


def get_title(resp):
    url_title = re.findall(r"title.*?\:.*?\'(.*?)\'\,", resp)[0]
    return url_title


def parse_txt(id):
    '''id: the docId scraped earlier'''

    url1 = 'https://wenku.baidu.com/api/doc/getdocinfo?callback=cb&doc_id={}'.format(id)
    content1 = fetch_url(url1)
    md5 = re.findall('"md5sum":"(.*?)"', content1)[0]
    pn = re.findall('"totalPageNum":"(.*?)"', content1)[0]
    rsign = re.findall('"rsign":"(.*?)"', content1)[0]

    url2 = 'https://wkretype.bdimg.com/retype/text/' + id + '?rn=' + pn + '&type=txt' + md5 + '&rsign=' + rsign
    content2 = json.loads(fetch_url(url2))
    result = ''
    for items in content2:
        for item in items:
            result = result + item['c'].replace('\\r', '\r').replace('\\n', '\n')
    return result


def save_file(filename, content):
    with open(filename, 'w', encoding='utf-8') as f:
        f.write(content)
        print('Saved as: ' + filename)


def parse_doc(content):
    result = ''
    url_list = re.findall('(https.*?0.json.*?)\\\\x22}', content)
    url_list = [addr.replace("\\\\\\/", "/") for addr in url_list]
    for url in url_list[:-5]:
        content = fetch_url(url)
        y = 0
        txtlists = re.findall('"c":"(.*?)".*?"y":(.*?),', content)
        for item in txtlists:
            if not y == item[1]:
                y = item[1]
                n = '\n'
            else:
                n = ''
            result += n
            result += item[0].encode('utf-8').decode('unicode_escape', 'ignore')
    return result


def main():
    url1 = input('Enter the Wenku URL to scrape: ')
    # open_url(url1)
    resp = fetch_url(url1)
    Id = get_id(resp)
    Type = get_type(resp)
    Title = get_title(resp)
    if Type == 'txt':
        result = parse_txt(Id)
        save_file(Title + '.txt', result)
    elif Type == 'doc':
        result = parse_doc(resp)
        save_file(Title + '.doc', result)


if __name__ == '__main__':
    main()

Answer from ～白+黑 (2022-03-12 22:40):
This means your re.findall call returned an empty list: the pattern matched nothing, so there is no element at index 0.

The asker selected this answer as the best answer.
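
One way to act on this diagnosis is to check whether re.findall matched anything before indexing with [0]. The sketch below is not part of the accepted answer: first_match and the two sample strings are hypothetical names and data made up for illustration, and only the docId pattern is copied from get_id in the question.

```python
import re


def first_match(pattern, text, default=None):
    """Return the first capture found by re.findall, or `default` if nothing matched."""
    matches = re.findall(pattern, text)
    # An empty list means the pattern did not match; indexing it would raise IndexError.
    return matches[0] if matches else default


# Pattern taken from get_id in the question.
DOC_ID_PATTERN = r"docId.*?\:.*?\'(.*?)\'\,"

# Hypothetical page fragments, used only to show both outcomes.
page_with_id = "docId: 'dcfab8bff705cc175527096e', docType: 'doc',"
page_without_id = "<html><body>no document metadata here</body></html>"

print(first_match(DOC_ID_PATTERN, page_with_id))     # the captured id
print(first_match(DOC_ID_PATTERN, page_without_id))  # None, instead of IndexError
```

With a helper like this, get_id, get_type and get_title could return None (or raise a more descriptive error) when the fetched page does not contain the expected text, which also makes it easier to see which pattern failed.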

