奋进小牛 2022-09-22 17:19 (acceptance rate: 90.6%)

My scraper fails with: name 'headers' is not defined. How can I fix this?

from lxml import etree
import requests
import csv
import time
def spider():
    headers = {
        'User-Agent':'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0.3626.81 Safari/537.36 SE 2.X MetaSr 1.0'
        }
pre_url='https://hefei.qfang.com/rent/f'
for x in range(1,13):
    html=requests.get(pre_url+str(x),headers=headers)
    time.sleep(2)  # wait 2 seconds after each GET
    selector=etree.HTML(html.text)
# first, get the list of listings
house_list=selector.xpath("//*[@id='cycleListings']/ul/li")
for house in house_list:
    xiaoqu=house.xpath("div[2]/div[3]/div/a/text()")[0]
    huxing=house.xpath("div[2]/div[2]/p[1]/text()")[0]
    area=house.xpath("div[2]/div[2]/p[2]/text()")[0]
    month_price=house.xpath("div[3]/p/span[1]/text()")[0]
    people=house.xpath("div/div[2]/div[4]/div[1]/p/a/text()")[0]
    people_picture=house.xpath("/div/div[2]/div[4]/p/a/img/text()")[0]
item=[xiaoqu,huxing,area,month_price,people,people_picture]
data_writer(item)
print('Scraping',xiaoqu)
def data_writer(item):
    with open()as csvfile:
        writer=csv.writer(csvfile)
        writer.writerow(item)
if __name__ == '__main__':
    spider()

3 answers

  • honestman_ 2022-09-22 17:21

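    The indentation is wrong. headers is a local variable of spider(), but the for loop that uses it sits at module level, outside the function, so Python looks for a global named headers and does not find one. Indent the rest of the scraping logic so that it belongs to spider():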

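    from lxml import etree
    import requests
    import csv
    import time

    # Hypothetical output path: the original called open() with no arguments.
    CSV_PATH = 'qfang_rent.csv'


    def first(nodes):
        # xpath() returns a list; return '' instead of raising IndexError when nothing matches
        return nodes[0].strip() if nodes else ''


    def spider():
        # headers is local to spider(), so every use of it must stay inside this function
        headers = {
            'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0.3626.81 Safari/537.36 SE 2.X MetaSr 1.0'
        }
        pre_url = 'https://hefei.qfang.com/rent/f'
        for x in range(1, 13):
            html = requests.get(pre_url + str(x), headers=headers)
            time.sleep(2)  # wait 2 seconds after each GET
            selector = etree.HTML(html.text)
            # get the list of listings on this page (parsed inside the page loop, so no page is skipped)
            house_list = selector.xpath("//*[@id='cycleListings']/ul/li")
            for house in house_list:
                xiaoqu = first(house.xpath("div[2]/div[3]/div/a/text()"))
                huxing = first(house.xpath("div[2]/div[2]/p[1]/text()"))
                area = first(house.xpath("div[2]/div[2]/p[2]/text()"))
                month_price = first(house.xpath("div[3]/p/span[1]/text()"))
                people = first(house.xpath("div/div[2]/div[4]/div[1]/p/a/text()"))
                # relative path (no leading '/'); <img> has no text node, so @src is assumed to be the intended value
                people_picture = first(house.xpath("div/div[2]/div[4]/p/a/img/@src"))
                # write one row per listing, not only the last one
                item = [xiaoqu, huxing, area, month_price, people, people_picture]
                data_writer(item)
                print('Scraping', xiaoqu)


    def data_writer(item):
        # append mode keeps earlier rows; newline='' avoids blank lines on Windows
        with open(CSV_PATH, 'a', newline='', encoding='utf-8') as csvfile:
            writer = csv.writer(csvfile)
            writer.writerow(item)


    if __name__ == '__main__':
        spider()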
    
    
    
    Accepted by the asker as the best answer.
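
For readers who want to see the scoping rule in isolation, here is a minimal sketch with no network access. The function names broken, module_level_use and fixed, and the 'demo-agent' value, are made up for illustration and are not part of the original script. It shows that a name assigned inside a function is not visible outside it, and that moving the dependent code into the same function resolves the error.

```python
def broken():
    # headers is a local variable: it exists only while broken() runs
    headers = {'User-Agent': 'demo-agent'}


def module_level_use():
    # Stands in for the unindented loop in the question: here `headers`
    # is looked up as a global name, and no such global exists.
    try:
        return headers
    except NameError as exc:
        return str(exc)


def fixed():
    headers = {'User-Agent': 'demo-agent'}
    # Every statement that needs headers is indented into the same function.
    return [f"page {x} requested with {headers['User-Agent']}" for x in range(1, 4)]


broken()  # calling the function does not leak its local variable
error_message = module_level_use()
pages = fixed()

print(error_message)  # name 'headers' is not defined
print(pages[0])       # page 1 requested with demo-agent
```

Another option, if the loop really has to live outside the function, is to define headers at module level or pass it in as a parameter. Keeping everything inside spider() is simply the smallest change to the original script.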

Question events

  • Closed by the system on September 30
  • Answer accepted on September 22
  • Question created on September 22
