Problem: TypeError: POST data should be bytes, an iterable of bytes, or a file object. It cannot be of type str.
Source code:
from urllib.request import urlopen
from bs4 import BeautifulSoup
import re

pages = set()

def getLinks(pageUrl):
    global pages
    html = urlopen('https://www.gdufe.edu.cn{}',format(pageUrl))
    bs = BeautifulSoup(html, 'html.parser')
    for link in bs.find_all('a', href=re.compile('cn')):
        if 'href' in link.attrs:
            if link.attrs['href'] not in pages:
                newPage = link.attrs['href']
                print(newPage)
                pages.add(newPage)
                getLinks(newPage)

getLinks(' ')
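Running it produces the error above.
I tried things like encode('utf-8'), but I am not sure whether I used it correctly; the result is still an error.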
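A likely cause, judging only from the code as posted: the `urlopen` line has a comma where a dot was probably intended (`'...{}',format(pageUrl)` instead of `'...{}'.format(pageUrl)`). With the comma, Python passes two arguments: the unformatted URL string, and the result of the built-in `format(pageUrl)`, which is a `str`. The second positional parameter of `urlopen` is `data` (the POST body), which would explain the complaint about POST data of type `str`. Encoding to bytes would probably only hide the message, since the request would still go to the literal `{}` URL as a POST. The sketch below illustrates the difference without touching the network; `build_url`, `fetch`, and the path `/index.htm` are hypothetical names used only for illustration.

```python
from urllib.request import urlopen

BASE = 'https://www.gdufe.edu.cn{}'

def build_url(page_url):
    # Dot, not comma: str.format fills the {} placeholder with page_url.
    return BASE.format(page_url)

def fetch(page_url):
    # Corrected call: a single argument, so urlopen issues a plain GET.
    # (Defined here for illustration; not called in this sketch.)
    return urlopen(build_url(page_url))

# What the comma version actually evaluates to: two separate values.
# '/index.htm' is a hypothetical path, not taken from the original post.
url_arg, data_arg = 'https://www.gdufe.edu.cn{}', format('/index.htm')

print(url_arg)                  # the URL still contains the literal {}
print(type(data_arg).__name__)  # str, which urlopen would receive as `data`
print(build_url('/index.htm'))
```

In the original script this would mean changing only the one line to `urlopen('https://www.gdufe.edu.cn{}'.format(pageUrl))`. Starting the crawl with `getLinks('')` rather than `getLinks(' ')` also avoids appending a space to the URL.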