可怜大学生 2024-05-09 18:13

Scraping the bilibili campus recruitment site with Python


My assignment is to scrape a recruitment website, so I picked bilibili's campus recruitment site.

Here is my code:

import requests
import pandas as pd


url="https://jobs.bilibili.com/api/campus/position/positionList"
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/122.0.0.0 Safari/537.36 Edg/122.0.0.0',
    'Cookie': 'i-wanna-go-back=-1; FEED_LIVE_VERSION=V8; buvid4=550D880A-1CB9-1664-25A1-F1117788A69011197-023050922-AxNIFF5DjNAxs7K6kvypyQ%3D%3D; buvid_fp=ea251a3c8679523cc28f41599b8bde57; CURRENT_BLACKGAP=0; rpdid=|(JYl~)Ylkuk0J\'uY)JYRJ|JR; b_ut=5; header_theme_version=CLOSE; enable_web_push=DISABLE; PVID=3; CURRENT_FNVAL=4048; CURRENT_QUALITY=112; bili_ticket=eyJhbGciOiJIUzI1NiIsImtpZCI6InMwMyIsInR5cCI6IkpXVCJ9.eyJleHAiOjE3MTU0MTYxOTIsImlhdCI6MTcxNTE1NjkzMiwicGx0IjotMX0.uHjB2gDiqkcBLjQnO3G3B6WUE3wDKt2rdvualSijqYs; bili_ticket_expires=1715416132; SESSDATA=a5775916%2C1730725427%2Cd6873%2A51CjCE7jgCfZ-fD_4CNQOGkUhuIDGqa6O4RfW9xxRC5WETQ6Mcm0kRNKyIGmrMNRXhCjkSVjRUMGx1eGJIakJBQTlGU3Bfa0lqejFQOFU2TkxBc2cyMTAtcFN1al8wQUdIZ0dyX0t5X0cwVmNWTGotMldtOEFFVEpTTDM3UjV1YnpPTTFWdjE4NjJRIIEC; bili_jct=31ba1f87e551b419ae4449481795e379; DedeUserID=40761336; DedeUserID__ckMd5=b2b5f4025124b58b; sid=672rwk29; home_feed_column=5; browser_resolution=1488-742; _uuid=64DD6463-9495-D6101-2941-F385E93D7BA502613infoc; buvid3=788EC87C-9934-1078-057E-6516A00C7AC638273infoc; b_nut=1715180903; bsource=search_bing'
}
p = {
    'pageSize': 10,
    'pageNum': 1,
    'positionName': '',
    'postCode': [],
    'postCodeList': [],
    'workLocationList': [],
    'deptCodeList': [],
    'positionTypeList': ['3'],
    'practiceTypes': [],
    'recruitType': "null",
    'workTypeList': ['3'],
}
rs = requests.post(url, data=p, headers=headers, verify=False)

print(rs)
print(rs.text)
# Output: {"code":-101,"data":null,"message":"ajSessionId不能为空"}
# (the message means "ajSessionId must not be empty")

Here is my output:

How should I fix this? Thanks, everyone.

I'm looking for genuinely useful answers; bots, please stay out.

13 answers

  • cjh4312 2024-05-09 18:30

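    Write out the full set of request headers. The version below adds the x-csrf and x-usertype headers, passes the cookies as a dict, and sends the body with json= instead of data=: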

    import requests
    import pandas as pd
    
    cookies = {
        'buvid3': 'D85133D4-A6B8-0710-BF23-F86C0B0523FF32408infoc',
        'b_nut': '1699100432',
        '_uuid': 'D5BB523B-26DC-10A510-B419-E3C6F46C5C4233754infoc',
        'buvid4': 'A8F9DED3-D440-C8D8-28F0-A04B97B327A992567-023102911-ocwwVqjadHK7Sbj6XtFVfV3VFLVUR1DMMN7RySqpWuB4tMzdcYqDbw%3D%3D',
        'rpdid': '0zbfVFQ4gw|2W4N8tSG|3q3|3w1R1OZQ',
        'buvid_fp_plain': 'undefined',
        'DedeUserID': '397021855',
        'DedeUserID__ckMd5': '0d9635cf054040a1',
        'hit-dyn-v2': '1',
        'fingerprint': '152b8d5afe96137987381d0b7edd5ad5',
        'buvid_fp': '152b8d5afe96137987381d0b7edd5ad5',
        'CURRENT_FNVAL': '4048',
        'PVID': '2',
        'CURRENT_QUALITY': '80',
        'bp_video_offset_397021855': '911546318273380352',
        'OUTFOX_SEARCH_USER_ID_NCOO': '1785815010.8805866',
        'SESSDATA': '4b09897b%2C1730729182%2C97669%2A52CjDFLUgjd774gedHo_nk4ec5blFLmUq7hmuHlmzv37D_iHOC-or1FPnGZR4IQaYOgHoSVjd4RzhhTEhQRXhQWDFuUl9hMkp6VkdqdlQzOXFCYko1elZQT2dobzdUeHY5cHF3WjRfdlpRc1dMSW5jZlVhc25DZ1o3QWlPWldRYnhqdmtVTjRYUFpnIIEC',
        'bili_jct': '2088f85c8efbb048892970470cc613a6',
        'sid': '6n2a9yco',
        'b_lsid': 'D8E7E86E_18F5C929A06',
        'share_source_origin': 'QQ',
        'bsource': 'share_source_qqchat',
        'bili_ticket': 'eyJhbGciOiJIUzI1NiIsImtpZCI6InMwMyIsInR5cCI6IkpXVCJ9.eyJleHAiOjE3MTU1MDQyNjMsImlhdCI6MTcxNTI0NTAwMywicGx0IjotMX0.l5iaD7teJLyu659XmWKs0Z3UkpGaKrok0sZ5pMs4OI4',
        'bili_ticket_expires': '1715504203',
        'bp_t_offset_397021855': '911546318273380352',
    }
    
    # Full request headers; x-csrf and x-usertype were absent from the question's code
    headers = {
        'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36',
        'x-csrf': '61a269c1-588e-4880-b3cf-c4fdae079470',
        'x-usertype': '4',
    }
    
    json_data = {
        'pageSize': 10,
        'pageNum': 1,
        'positionName': '',
        'postCode': [],
        'postCodeList': [],
        'workLocationList': [],
        'workTypeList': [
            '3',
        ],
        'positionTypeList': [
            '3',
        ],
        'deptCodeList': [],
        'recruitType': None,
        'practiceTypes': [],
    }
    
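    # json= serializes the body as JSON and sets Content-Type: application/json
    response = requests.post(
        'https://jobs.bilibili.com/api/campus/position/positionList',
        cookies=cookies,
        headers=headers,
        json=json_data,
    )
    # Load the returned list of positions into a DataFrame and print the position names
    dd = pd.DataFrame(response.json()['data']['list'])
    print(dd['positionName'])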
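
One difference between the question's code and the code above is how the body is sent: `data=p` in the question, `json=json_data` here. With `data=`, requests form-encodes the dict; with `json=`, it serializes the dict to JSON and sets the matching Content-Type. The sketch below prepares both requests without sending them, so the difference can be inspected offline. The trimmed `payload` is illustrative only, and this thread does not establish whether the body encoding or the extra headers is what clears the ajSessionId error.

```python
import json

import requests

URL = "https://jobs.bilibili.com/api/campus/position/positionList"

# Trimmed, illustrative payload (not the full body used in the thread).
payload = {
    "pageSize": 10,
    "pageNum": 1,
    "positionTypeList": ["3"],
    "recruitType": None,
}

# .prepare() builds the request but does not send it: no network traffic.
form_req = requests.Request("POST", URL, data=payload).prepare()
json_req = requests.Request("POST", URL, json=payload).prepare()

print("data= Content-Type:", form_req.headers["Content-Type"])
print("data= body:        ", form_req.body)
print("json= Content-Type:", json_req.headers["Content-Type"])
print("json= body:        ", json_req.body)

# The JSON body round-trips to the original dict, including the None value.
decoded = json.loads(json_req.body)
```

In the form-encoded body, empty lists and `None` values are dropped and nested structure is lost, whereas the JSON body keeps every key. Note also that the question sets `'recruitType': "null"` (a string), while the answer uses `None`, which `json=` serializes as a JSON `null`.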
    
    


    Selected by the asker as the best answer.
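
A possible follow-up to the accepted code: `response.json()['data']['list']` raises a `TypeError` when the server returns `"data": null`, as in the error response quoted in the question. The sketch below checks for that case and then collects several pages by incrementing `pageNum`. The helper names `extract_positions` and `fetch_all_positions` are hypothetical, and the stopping rule (an empty `list` past the last page) is an unverified assumption about this API.

```python
def extract_positions(payload: dict) -> list:
    """Return payload['data']['list'], or raise with the server's own message."""
    data = payload.get("data")
    if data is None:
        # Matches the failure shape shown in the question: data is null.
        raise RuntimeError(
            f"API error {payload.get('code')}: {payload.get('message')}"
        )
    return data.get("list") or []


def fetch_all_positions(fetch_page, max_pages: int = 100) -> list:
    """Collect positions page by page.

    fetch_page(page_num) must return the decoded JSON response as a dict.
    Assumption: the API returns an empty list once the pages run out.
    """
    positions = []
    for page_num in range(1, max_pages + 1):
        page = extract_positions(fetch_page(page_num))
        if not page:
            break
        positions.extend(page)
    return positions


# The failure response quoted in the question, as a Python dict:
error_payload = {"code": -101, "data": None, "message": "ajSessionId不能为空"}
try:
    extract_positions(error_payload)
except RuntimeError as exc:
    print(exc)
```

To use it with the accepted code, `fetch_page` could be a small function that calls `requests.post(...)` with `json={**json_data, 'pageNum': page_num}` and returns `response.json()`; the combined list can then be passed to `pd.DataFrame`.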
Question events

  • Closed on May 11
  • Answer accepted on May 9
  • Question edited on May 9
  • Bounty of ¥15 added on May 9