小顾想休息 2022-03-18 00:07 采纳率: 50%
浏览 948
已结题

代码运行报错'NoneType' object has no attribute 'get'

报错'NoneType' object has no attribute 'get'
# -*- coding: UTF-8 -*-
import json     
import requests

headers = {
        "Cookie": "user_trace_token=20171010163413-cb524ef6-ad95-11e7-85a7-525400f775ce; LGUID=20171010163413-cb52556e-ad95-11e7-85a7-525400f775ce; JSESSIONID=ABAAABAABEEAAJAA71D0768F83E77DA4F38A5772BDFF3E6; _gat=1; PRE_UTM=m_cf_cpt_baidu_pc; PRE_HOST=bzclk.baidu.com; PRE_SITE=http%3A%2F%2Fbzclk.baidu.com%2Fadrc.php%3Ft%3D06KL00c00f7Ghk60yUKm0FNkUsjkuPdu00000PW4pNb00000LCecjM.THL0oUhY1x60UWY4rj0knj03rNqbusK15yDLnWfkuWN-nj0sn103rHm0IHdDPbmzPjI7fHn3f1m3PDnsnH9anDFArH6LrHm3PHcYf6K95gTqFhdWpyfqn101n1csPHnsPausThqbpyfqnHm0uHdCIZwsT1CEQLILIz4_myIEIi4WUvYE5LNYUNq1ULNzmvRqUNqWu-qWTZwxmh7GuZNxTAn0mLFW5HDLP1Rv%26tpl%3Dtpl_10085_15730_11224%26l%3D1500117464%26attach%3Dlocation%253D%2526linkName%253D%2525E6%2525A0%252587%2525E9%2525A2%252598%2526linkText%253D%2525E3%252580%252590%2525E6%25258B%252589%2525E5%25258B%2525BE%2525E7%2525BD%252591%2525E3%252580%252591%2525E5%2525AE%252598%2525E7%2525BD%252591-%2525E4%2525B8%252593%2525E6%2525B3%2525A8%2525E4%2525BA%252592%2525E8%252581%252594%2525E7%2525BD%252591%2525E8%252581%25258C%2525E4%2525B8%25259A%2525E6%25259C%2525BA%2526xp%253Did%28%252522m6c247d9c%252522%29%25252FDIV%25255B1%25255D%25252FDIV%25255B1%25255D%25252FDIV%25255B1%25255D%25252FDIV%25255B1%25255D%25252FH2%25255B1%25255D%25252FA%25255B1%25255D%2526linkType%253D%2526checksum%253D220%26ie%3Dutf8%26f%3D8%26ch%3D2%26tn%3D98010089_dg%26wd%3D%25E6%258B%2589%25E5%258B%25BE%25E7%25BD%2591%26oq%3D%25E6%258B%2589%25E5%258B%25BE%25E7%25BD%2591%26rqlang%3Dcn%26oe%3Dutf8; PRE_LAND=https%3A%2F%2Fwww.lagou.com%2F%3Futm_source%3Dm_cf_cpt_baidu_pc; _putrc=347EB76F858577F7; login=true; unick=%E6%9D%8E%E5%87%AF%E6%97%8B; showExpriedIndex=1; showExpriedCompanyHome=1; showExpriedMyPublish=1; hasDeliver=63; TG-TRACK-CODE=index_search; _gid=GA1.2.1110077189.1507624453; _ga=GA1.2.1827851052.1507624453; LGSID=20171011082529-afc7b124-ae1a-11e7-87db-525400f775ce; LGRID=20171011082545-b94d70d5-ae1a-11e7-87db-525400f775ce; Hm_lvt_4233e74dff0ae5bd0a3d81c6ccf756e6=1507444213,1507624453,1507625209,1507681531; Hm_lpvt_4233e74dff0ae5bd0a3d81c6ccf756e6=1507681548; SEARCH_ID=e420ce4ae5a7496ca8acf3e7a5490dfc; index_location_city=%E5%8C%97%E4%BA%AC",
        "Host": "www.lagou.com",
        'Origin': 'https://www.lagou.com',
        'Referer': 'https://www.lagou.com/jobs/list_%E6%95%B0%E6%8D%AE%E5%88%86%E6%9E%90?labelWords=&fromSearch=true&suginput=',
        'User-Agent': 'Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/53.0.2785.104 Safari/537.36 Core/1.53.3408.400 QQBrowser/9.6.12028.400'}
post_data = {'first': 'false', 'kd':'数据分析师' }#这是请求网址的一些参数

def start_requests(pn):
    html = requests.post(myurl + str(pn), data=post_data, headers=headers, verify=False)
    html_text = html.text
    content = json.loads(html_text)  #loads()暂时可以理解为把json格式转为字典格式,而dumps()则是相反的
    pagesize = content.get('content').get('pageSize')    #这是Pytho字典中的get()用法
    return pagesize

def get_result(pagesize):
    for page in range(1, pagesize+1):
        content_next = json.loads(requests.post(myurl + str(page), data=post_data, headers=headers, verify=False).text)
        company_info = content_next.get('content').get('positionResult').get('result')
        if company_info:
            for p in company_info:
                line = str(p['city']) + ',' + str(p['companyFullName']) + ',' + str(p['companyId']) + ',' + \
                       str(p['companyLabelList']) + ',' + str(p['companyShortName']) + ',' + str(p['companySize']) + ',' + \
                       str(p['businessZones']) + ',' + str(p['firstType']) + ',' + str(
                    p['secondType']) + ',' + \
                       str(p['education']) + ',' + str(p['industryField']) +',' + \
                       str(p['positionId']) +',' + str(p['positionAdvantage']) +',' + str(p['positionName']) +',' + \
                       str(p['positionLables']) +',' + str(p['salary']) +',' + str(p['workYear']) + '\n'
                file.write(line)


if __name__ == '__main__':
    title = 'city,companyFullName,companyId,companyLabelList,companyShortName,companySize,businessZones,firstType,secondType,education,industryField,positionId,positionAdvantage,positionName,positionLables,salary,workYear\n'
    file = open('%s.txt' % '爬虫拉勾网', 'a')   #创建爬虫拉勾网.txt文件
    file.write(title)    #把title部分写入文件作为表头
    cityList = [u'北京', u'上海',u'深圳',u'广州',u'杭州',u'成都',u'南京',u'武汉',u'西安',u'厦门',u'长沙',u'苏州',u'天津',u'郑州']  #这里只选取了比较热门的城市,其他城市只几个公司提供职位
    for city in cityList:
        print('爬取%s' % city)
        myurl = 'https://www.lagou.com/jobs/positionAjax.json?px=default&city={}&needAddtionalResult=false&pn='.format(city)
        pagesize=start_requests(1)
        get_result(pagesize)
    file.close()

报错是这样的
AttributeError                            Traceback (most recent call last)
~\AppData\Local\Temp/ipykernel_22456/784680725.py in <module>
     42         print('爬取%s' % city)
     43         myurl = 'https://www.lagou.com/jobs/positionAjax.json?px=default&city={}&needAddtionalResult=false&pn='.format(city)
---> 44         pagesize=start_requests(1)
     45         get_result(pagesize)
     46     file.close()

~\AppData\Local\Temp/ipykernel_22456/784680725.py in start_requests(pn)
     15     html_text = html.text
     16     content = json.loads(html_text)  #loads()暂时可以理解为把json格式转为字典格式,而dumps()则是相反的
---> 17     pagesize = content.get('content').get('pageSize')    #这是Pytho字典中的get()用法
     18     return pagesize
     19 

AttributeError: 'NoneType' object has no attribute 'get'


各位有知道原因的还望不吝赐教。

  • 写回答

2条回答 默认 最新

  • 关注

    你content字典中没有content属性
    content.get('content')获取的是 None
    你 print(content) 输出下字典看看获取的内容正确吗? 是不是错误信息
    可能是Cookie过期了,网站有反爬设置

    如有帮助,请点击我的回答下方的【采纳该答案】按钮帮忙采纳下,谢谢!

    img

    本回答被题主选为最佳回答 , 对您是否有帮助呢?
    评论 编辑记录
查看更多回答(1条)

报告相同问题?

问题事件

  • 系统已结题 4月1日
  • 已采纳回答 3月24日
  • 创建了问题 3月18日

悬赏问题

  • ¥15 Catia V5 R20 64位 安装过程中选择orbix配置创建套接字失败
  • ¥100 C51单片机设计交通灯时出现的问题
  • ¥15 R语言爬虫的时候元素和园代码不一样怎么解决呀
  • ¥15 VS2022多项目启动有问题
  • ¥15 SQL删除添加数据后序号不连续问题。
  • ¥15 首次运行OmniEvent运行报错
  • ¥15 有没有人知道这个问题怎么解决
  • ¥15 comsol电力电缆载流量仿真
  • ¥15 webSocket可以接TCP socket接口吗
  • ¥60 mpi并行出错,CFD++计算