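Symptoms and background
I am learning bs4 by following the examples in a book. After passing a File object to bs4.BeautifulSoup(), I called type() on the results and ran into the following problems: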
- According to the book, type(elems) should be list, but what I get back is bs4.element.ResultSet. Is the book wrong?
- When I add print(exampleFile.read()) as the second line and run the script again, type(elems[0]) fails with "IndexError: list index out of range". What is going on here?
- Are these two problems related?
Relevant code (please do not paste screenshots)
exampleFile = open('example.html')
# print(exampleFile.read())
exampleSoup = bs4.BeautifulSoup(exampleFile, 'html.parser')
print(type(exampleSoup))
elems = exampleSoup.select('#author')
print(type(elems))
print(len(elems))
print(type(elems[0]))
print(elems[0].getText())
print(str(elems[0]))
print(elems[0].attrs)
Output and error messages
Problem 1. The type printed on the second line is not list, as the book says it should be.
<class 'bs4.BeautifulSoup'>
<class 'bs4.element.ResultSet'>
1
<class 'bs4.element.Tag'>
Song Wei
<span id="author">Song Wei</span>
{'id': 'author'}
[Finished in 327ms]
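A possible reading of Problem 1, offered as a sketch rather than a definitive answer: in the bs4 versions I am aware of, ResultSet is declared as a subclass of list, so type() prints the subclass name while indexing, len() and isinstance(..., list) still behave the way the book describes. The ResultSet class below is a hypothetical stand-in written only for illustration; it is not the library's own code.

```python
# Hypothetical stand-in for bs4.element.ResultSet.
# Assumption: the real class also subclasses list; this is NOT bs4's implementation.
class ResultSet(list):
    pass


elems = ResultSet(['<span id="author">Song Wei</span>'])

type_name = type(elems).__name__      # type() reports the subclass, not 'list'
is_a_list = isinstance(elems, list)   # instances of a subclass still count as lists
first = elems[0]                      # indexing works exactly as on a plain list

print(type_name)
print(is_a_list)
print(len(elems))
```

With bs4 installed, the same check on the real object would be isinstance(exampleSoup.select('#author'), list). If that holds, the book and the observed output would differ only in which class name gets printed.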
Problem 2. After adding print(exampleFile.read()), the program raises a list index out of range error.
<html>
<head><title>The Website Title</title></head>
<body>
<p>Download my <strong>Python</strong> book from <a href="http://www.baidu.com">my site</a>.</p>
<p class="slogan">Learn python the easy way!</p>
<p>By <span id="author">Song Wei</span></p>
</body>
</html>
<class 'bs4.BeautifulSoup'>
<class 'bs4.element.ResultSet'>
0
Traceback (most recent call last):
  File "D:\chwlsw\py-test\chapter11_web\mapIt.py", line 38, in <module>
    print(type(elems[0]))
IndexError: list index out of range
[Finished in 301ms]
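Problem 2 looks like a file-position issue rather than a bs4 issue, and so is probably unrelated to Problem 1. A file object remembers how far it has been read, so once read() has consumed everything, a later read from the same object returns an empty string. Assuming BeautifulSoup reads the file object from its current position, it would then parse empty markup, select() would match nothing, and elems[0] would raise IndexError. The minimal sketch below uses only the standard library; the temporary file and its contents are made up for the demonstration.

```python
import os
import tempfile

# Write a small HTML snippet to a temporary file (hypothetical content and path).
html = '<p>By <span id="author">Song Wei</span></p>'
with tempfile.NamedTemporaryFile('w', suffix='.html', delete=False,
                                 encoding='utf-8') as tmp:
    tmp.write(html)
    path = tmp.name

with open(path, encoding='utf-8') as example_file:
    first_read = example_file.read()    # consumes the whole file
    second_read = example_file.read()   # position is at end of file, nothing left
    example_file.seek(0)                # rewind to the beginning
    after_seek = example_file.read()    # the full content is available again

os.remove(path)

print(repr(second_read))
print(after_seek == first_read)
```

In the original script, either calling exampleFile.seek(0) after the print, or reading the text once into a variable and passing that string to bs4.BeautifulSoup(), should avoid the empty result.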
What I would like to achieve
Please help me sort out these errors. Thank you!