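Stuck at home with nothing to do, so I'm teaching myself web scraping! I typed this code in from a book. It runs without any errors, but it just doesn't download any images! Could someone more experienced point me in the right direction?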
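import re
import urllib.request

# URL-encode the search keyword so it can go into the query string.
key_name = urllib.request.quote("笔记本电脑")


def savefile(data):
    # Append one line to the log of requested URLs.
    path = "C://Users/Administrator/Desktop/taobao_url.txt"
    with open(path, "a", encoding="utf-8") as file:
        file.write(data + "\n")


for p in range(0, 6):
    # The "s" parameter advances by 48 for each results page.
    url = ("https://s.taobao.com/search?q=" + key_name
           + "&imgfile=&ie=utf8&p4ppushleft=5%2C48" + "&s=" + str(p * 48))
    datal = urllib.request.urlopen(url).read().decode("utf-8")
    savefile(url)
    # Capture image URLs that appear in the page source as "pic_url":"//...".
    pat = 'pic_url":"//(.*?)"'
    img_url = re.compile(pat).findall(datal)
    print(img_url)  # an empty list means no pic_url fields were found
    for a_i in range(0, len(img_url)):
        this_img = img_url[a_i]
        this_img_url = "http://" + this_img
        print(this_img_url)
        # Raw string so the backslash in the Windows path is taken literally.
        img_path = r"D:\imagetb" + str(p) + str(a_i) + ".jpg"
        urllib.request.urlretrieve(this_img_url, img_path)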
I suspect the problem is with the URL. Whenever I changed it I kept getting errors, so I changed it back!
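One way to narrow this down, offered as a sketch rather than a confirmed diagnosis: check whether the downloaded page contains any `pic_url` fields at all, since the script silently does nothing when the regex finds no matches. The helper names (`fetch_page`, `extract_image_urls`, `describe_page`), the `User-Agent` value, and the sample strings below are all made up for illustration, and whether the site actually needs such a header is an assumption.

```python
import re
import urllib.request

# Same pattern as in the script above.
PIC_URL_PATTERN = re.compile(r'pic_url":"//(.*?)"')

# Hypothetical browser-like header; whether the site requires it is an assumption.
DEFAULT_HEADERS = {"User-Agent": "Mozilla/5.0"}


def fetch_page(url, headers=DEFAULT_HEADERS, timeout=10):
    """Download a page, sending explicit request headers."""
    request = urllib.request.Request(url, headers=headers)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


def extract_image_urls(html):
    """Return a full image URL for every pic_url field found in html."""
    return ["http://" + match for match in PIC_URL_PATTERN.findall(html)]


def describe_page(html):
    """Summarise a downloaded page so an empty result can be explained."""
    return {
        "length": len(html),
        "has_pic_url": "pic_url" in html,
        "image_count": len(extract_image_urls(html)),
    }


# Offline check on hand-written samples (not real site output).
sample = ('{"pic_url":"//img.example.com/a.jpg","title":"x",'
          '"pic_url":"//img.example.com/b.jpg"}')
print(describe_page(sample))
print(describe_page("<html>login required</html>"))
```

Against the real site, something like `html = fetch_page(url)` followed by `print(describe_page(html))` and `print(html[:500])` would show whether the server returned a results page or something else. If `image_count` is 0, the regex has nothing to match, and the inner download loop never runs.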