qq_43415765
漫漫在努力
Acceptance rate: 50%
2021-03-07 23:39

Error when scraping Fang.com (房天下) data with Python while following a tutorial video

20
Accepted

My code:

import requests as req
from bs4 import BeautifulSoup

res = req.get("https://zj.esf.fang.com/")
soup = BeautifulSoup(res.text, "html.parser")

houses = soup.select(".shop_list dl")

def getHouseInfo(url):
    info = {}
    soup = BeautifulSoup(req.get(url).text, "html.parser")
    res = soup.select(".tab-cont-right .trl-item1")
    print(res)
    for re in res:
        tmp = re.text.strip().split('\n')
        info[tmp[1].strip()] = tmp[0].strip()
    xiaoqu = soup.select(".rcont .blue")[0].text
    info["小区名字"] = xiaoqu  # key means "community name"
    zongjia = soup.select(".tab-cont-right .trl-item")
    info["总价"] = zongjia[0].text  # key means "total price"
    print(info)

getHouseInfo("https://zj.esf.fang.com/chushou/3_181595442.htm")

domain = "https://zj.esf.fang.com"

# Iterate over the returned house listings
for house in houses:
    # Wrap in try/except to handle exceptions
    try:
        print(domain + house.select(".clearfix a")[0]['href'])
    except Exception as e:
        print("---------->", e)
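One possible weak point in the loop inside `getHouseInfo` is the fixed index `tmp[1]`: if the text of a `.trl-item1` block has blank lines between the value and its label, `tmp[1]` is an empty string. The sketch below runs on a hypothetical text sample (the real page markup is not shown in this thread) and uses a hypothetical helper, `parse_item`, that drops blank lines before taking the first line as the value and the last line as the label.

```python
def parse_item(text):
    """Return (label, value) from one multi-line text block.

    Assumes the value comes first and the label last, as the
    indexing in getHouseInfo suggests; blank lines are ignored.
    """
    lines = [line.strip() for line in text.strip().split("\n")]
    lines = [line for line in lines if line]
    if len(lines) < 2:
        raise ValueError(f"expected a value and a label, got {lines!r}")
    return lines[-1], lines[0]


# Hypothetical sample: a value, a blank line, then the label.
sample = "\n  2室2厅  \n\n  户型  \n"

# The naive split keeps the blank line, so index 1 is empty here.
raw = sample.strip().split("\n")
print(repr(raw[1]))

label, value = parse_item(sample)
print({label: value})
```

Whether the live page really contains such blank lines is an assumption; printing `repr(re.text)` inside the loop would show the actual layout.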

 

The error I get, and the result scraped in the tutorial video, are as follows:

 

 


3 answers

  • jslang 天际的海浪 1 month ago
    import requests as req
    from bs4 import BeautifulSoup

    res = req.get("https://zj.esf.fang.com/")
    soup = BeautifulSoup(res.text, "html.parser")
    houses = soup.select(".shop_list dl")

    def getHouseInfo(url):
        info = {}
        soup = BeautifulSoup(req.get(url).text, "html.parser")
        items = soup.select(".tab-cont-right .trl-item1")
        print(items)
        for item in items:
            tmp = item.text.strip().split('\n')
            # The last line is the label, the first line is the value
            info[tmp[-1].strip()] = tmp[0].strip()
        xiaoqu = soup.select(".rcont .blue")[0].text
        info["小区名字"] = xiaoqu  # key means "community name"
        zongjia = soup.select(".tab-cont-right .trl-item")
        info["总价"] = zongjia[0].text.strip()  # key means "total price"
        print(info)

    # The detail-page URL includes the rfss query parameter
    getHouseInfo("https://zj.esf.fang.com/chushou/3_181595442.htm?rfss=1-ca39b791988eaa89e8-1d")
    
  • jslang 天际的海浪 1 month ago

    The URL https://zj.esf.fang.com/chushou/3_181595442.htm must have the parameter ?rfss=1-ca39b791988eaa89e8-1d appended to it:

    getHouseInfo("https://zj.esf.fang.com/chushou/3_181595442.htm?rfss=1-ca39b791988eaa89e8-1d")

     

    import requests as req
    from bs4 import BeautifulSoup

    res = req.get("https://zj.esf.fang.com/")
    soup = BeautifulSoup(res.text, "html.parser")
    houses = soup.select(".shop_list dl")

    def getHouseInfo(url):
        info = {}
        soup = BeautifulSoup(req.get(url).text, "html.parser")
        items = soup.select(".tab-cont-right .trl-item1")
        print(items)
        for item in items:
            tmp = item.text.strip().split('\n')
            info[tmp[1].strip()] = tmp[0].strip()
        # These lookups do not depend on the loop variable, so they run once
        xiaoqu = soup.select(".rcont .blue")[0].text
        info["小区名字"] = xiaoqu  # key means "community name"
        zongjia = soup.select(".tab-cont-right .trl-item")
        info["总价"] = zongjia[0].text  # key means "total price"
        print(info)

    getHouseInfo("https://zj.esf.fang.com/chushou/3_181595442.htm?rfss=1-ca39b791988eaa89e8-1d")
    

  • ProfSnail ProfSnail 1 month ago

    The answer below is correct: the URL must be given in full.
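
Typing the parameter by hand works for a single page. As a rough sketch, it can also be appended with the standard library. The helper name `add_query` is hypothetical, and the `rfss` value is simply the one quoted in this thread; how the site generates it is not covered here, so it should not be assumed valid for other listings.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def add_query(url, **params):
    """Return url with the given query parameters added or replaced."""
    parts = urlsplit(url)
    query = dict(parse_qsl(parts.query))  # keep parameters already present
    query.update(params)
    return urlunsplit(parts._replace(query=urlencode(query)))


base = "https://zj.esf.fang.com/chushou/3_181595442.htm"
full = add_query(base, rfss="1-ca39b791988eaa89e8-1d")
print(full)
```

With `requests`, passing `params={"rfss": "..."}` to `req.get` should have the same effect without building the string manually.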
