qq_37121681
超级肥佬
采纳率100%
2019-01-22 17:49

如何爬取这个音乐网站上的下载链接?网址:http://www.dj024.com

音乐网站:http://www.dj024.com 爬取“现场串烧”列表下的每一个音乐下载地址。源码里面的下载地址是异步加载的。

图片说明

可是怎么也获取不到json,访问如下地址获取的不是json,是html代码,设置“Content-Type: application/json”,用session都不行!

图片说明

求大神指教,最好贴出代码。

  • 点赞
  • 写回答
  • 关注问题
  • 收藏
  • 复制链接分享
  • 邀请回答

1条回答

  • weixin_42897178 weixin_42897178 2年前
    import requests
    import json
    
    url = 'http://www.dj024.com/music/getData.html'
    headers = {'Accept':'application/json, text/javascript, */*; q=0.01',
               'Accept-Encoding':'gzip, deflate',
               'Accept-Language':'zh-CN,zh;q=0.9',
               'Connection':'keep-alive',
               'Content-Length':'8',
               'Content-Type':'application/x-www-form-urlencoded; charset=UTF-8',
               'Cookie':'PHPSESSID=3msh15458cladsa4rs44blcqn6; Hm_lvt_5638419b16434ebb48ea2bdef1114b97=1548209089,1548209110; jy_home_user=think%3A%7B%22status%22%3A%221%22%2C%22data%22%3Anull%7D; jy_home_ListenRecord=think%3A%5B%2237393%22%2C%2237401%22%2C%2237348%22%5D; Hm_lpvt_5638419b16434ebb48ea2bdef1114b97=1548209462; jy_home___forward__=%2Fuser%2Faccount%2FgetActive.html%3Ft%3D0.5699561267137225',
               'Host':'www.dj024.com',
               'Origin':'http://www.dj024.com',
               'Referer':'http://www.dj024.com/topics/37401.html',
               'User-Agent':'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/68.0.3440.106 Safari/537.36',
               'X-Requested-With':'XMLHttpRequest'}
    data = {'id':'37401'}
    html = requests.post(url,headers = headers,data = data).content
    print(html)
    

    字段是listen_url,有转义符,用replace替换了就行

    点赞 1 评论 复制链接分享