我爬取豆瓣阅读其中一个子网页,返回的内容不太正常,跟页面源码不同,进行不了页面解析,请问这个是怎么回事呢。
hash_url = 'https://read.douban.com/category/1?sort=hot&page=1%27
resp = requests.get(hash_url, headers=headers)
print(resp.text)
打印出现
Ark.kindTree = [{"children": [{"children": [], "id": 501, "name": "\u8a00\u60c5\u5c0f\u8bf4"}, {"children": [], "id": 532, "name": "\u5973\u6027\u5c0f\u8bf4"}, {"children": [], "id": 508, "name": "\u60ac\u7591\u5c0f\u8bf4"}, {"children": [], "id": 506, "name": "\u5e7b\u60f3\u5c0f\u8bf4"}, {"children": [], "id": 505, "name": "\u79d1\u5e7b\u5c0f\u8bf4"},
等内容,请问是网站反爬策略吗
![](https://profile-avatar.csdnimg.cn/default.jpg!4)
爬取豆瓣阅读页面代码返回出现问题
- 写回答
- 好问题 0 提建议
- 追加酬金
- 关注问题
- 邀请回答
-
2条回答 默认 最新
- cjh4312 2023-09-02 19:26关注
需要这个吗
import requests import pandas as pd headers = { 'Referer': 'https://read.douban.com/category/1?sort=hot&page=1%27', 'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/116.0.0.0 Safari/537.36', } json_data = { 'sort': 'hot', 'page': 1, 'kind': 1, 'query': '\n query getFilterWorksList($works_ids: [ID!]) {\n worksList(worksIds: $works_ids) {\n \n \n title\n cover(useSmall: false)\n url\n isBundle\n coverLabel(preferVip: true)\n \n \n url\n title\n\n \n author {\n name\n url\n }\n origAuthor {\n name\n url\n }\n translator {\n name\n url\n }\n\n \n abstract\n authorHighlight\n editorHighlight\n\n \n isOrigin\n kinds {\n \n name @skip(if: true)\n shortName @include(if: true)\n id\n \n }\n ... on WorksBase @include(if: true) {\n wordCount\n wordCountUnit\n }\n ... on WorksBase @include(if: false) {\n inLibraryCount\n }\n ... on WorksBase @include(if: false) {\n \n isEssay\n \n ... on EssayWorks {\n favorCount\n }\n \n \n \n averageRating\n ratingCount\n url\n isColumn\n isFinished\n \n \n \n }\n ... on EbookWorks @include(if: false) {\n \n ... on EbookWorks {\n book {\n url\n averageRating\n ratingCount\n }\n }\n \n }\n ... on WorksBase @include(if: false) {\n isColumn\n isEssay\n onSaleTime\n ... on ColumnWorks {\n updateTime\n }\n }\n ... on WorksBase @include(if: true) {\n isColumn\n ... on ColumnWorks {\n isFinished\n }\n }\n ... on EssayWorks {\n essayActivityData {\n \n title\n uri\n tag {\n name\n color\n background\n icon2x\n icon3x\n iconSize {\n height\n }\n iconPosition {\n x y\n }\n }\n \n }\n }\n highlightTags {\n name\n }\n ... on WorksBase @include(if: false) {\n fanfiction {\n tags {\n id\n name\n url\n }\n }\n }\n \n \n ... on WorksBase {\n copyrightInfo {\n newlyAdapted\n newlyPublished\n adaptedName\n publishedName\n }\n }\n\n isInLibrary\n ... on WorksBase @include(if: false) {\n \n fixedPrice\n salesPrice\n isRebate\n \n }\n ... on EbookWorks {\n \n fixedPrice\n salesPrice\n isRebate\n \n }\n ... on WorksBase @include(if: true) {\n ... on EbookWorks {\n id\n isPurchased\n isInWishlist\n }\n }\n ... on WorksBase @include(if: false) {\n fanfiction {\n fandoms {\n title\n url\n }\n }\n }\n ... on WorksBase @include(if: false) {\n fanfiction {\n kudoCount\n }\n }\n \n id\n isOrigin\n isEssay\n }\n }\n ', 'variables': {}, } response = requests.post('https://read.douban.com/j/kind/', headers=headers, json=json_data) data=pd.DataFrame(response.json()['list'])
本回答被题主选为最佳回答 , 对您是否有帮助呢?解决 1无用
悬赏问题
- ¥30 arduino vector defined in discarded section `.text' of wiring.c.o (symbol from plugin)
- ¥20 关于#c++#的问题:(2)运算二叉树·表达式一般由一个运算符和两个操作数组成:(相关搜索:二叉树遍历)
- ¥20 如何训练大模型在复杂因素组成的系统中求得最优解
- ¥15 关于#r语言#的问题:在进行倾向性评分匹配时,使用“match it"包提示”错误于eval(family$initialize): y值必需满足0 <= y <= 1“请问在进行PSM时
- ¥45 求17位带符号原码乘法器verilog代码
- ¥20 PySide6扩展QLable实现Word一样的图片裁剪框
- ¥15 matlab数据降噪处理,提高数据的可信度,确保峰值信号的不损失?
- ¥15 怎么看我在bios每次修改的日志
- ¥15 python+mysql图书管理系统
- ¥15 Questasim Error: (vcom-13)