BIECHONGFU 2020-11-26 16:02 Acceptance rate: 0%
Views: 47
Closed

Error when scraping data from a POST request

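The content I need to scrape is on http://drugs.dxy.cn/drug/89790/detail.htm: the hidden text that expands when you click the triangle button under the "Ingredients" (药品成份) section.
The response I get back is an error: dwr.engine._remoteHandleException('2','0',{javaClassName:"java.lang.Throwable",message:"Error"});

{javaClassName:"java.lang.Throwable",message:"Error"} appears in place of the content I am trying to scrape.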

import urllib.parse
import urllib.request

params = {
        'callCount': '1',
        'page': '/drug/89790/detail.htm',
        'httpSessionId': '',
        'scriptSessionId': 'D8B63E5C6C13BEC549EB1F56C5D7D79B627',
        'c0-scriptName': 'DrugUtils',
        'c0-methodName': 'showDetail',
        'c0-id': '0',
        'c0-param0=number': '89790',
        'c0-param1=number': '2',
        'batchId': '2'
}

headers = {
        'Accept': '*/*',
        'Accept-Encoding': 'gzip, deflate',
        'Accept-Language': 'zh-CN,zh;q=0.9,en;q=0.8,en-GB;q=0.7,en-US;q=0.6',
        'Connection': 'keep-alive',
        'Content-Length': '216',
        'Content-Type': 'text/plain',
        'Cookie': '__auc=dc13ab771758c1e72b92c2c5a91; _ga=GA1.2.1809607133.1604546780; ifVisitOldVerBBS=false; __utmz=129582553.1605690840.25.2.utmcsr=baidu|utmccn=(organic)|utmcmd=organic; Hm_lvt_8a6dad3652ee53a288a11ca184581908=1605690096,1605691052; __utma=17875052.1809607133.1604546780.1605691237.1605691237.1; __utmz=17875052.1605691237.1.1.utmcsr=auth.dxy.cn|utmccn=(referral)|utmcmd=referral|utmcct=/; CLASS_CASTGC=8a9bb91b2f22c8050a8b5b9cf8fe899de049a4c8313f50cc8ab7fe38f19bee7ad20ca37293546c823f784320d432e3d44c33e18d9a3f9cbbd8a476a3aae95dbc80b7417db552938dc07b5f21952a298d39d3520ef384e70276ffb7b01a6e8c5dc59aba9ce897ec50dd87cbbd08d5f8f3e5f7fe4d6d20ba4d228a2be7cedf0ab1543342ad4f8d88a51e7fe66b80a5e7afd98e0872394d8e09791da512a6d5a2aa0cd7f830867e4f571e1beff081847e11a2b0fe09f3f04b0f903ee716a7cf535a7ffaed621c8512e9b108188b25610dbf9be6a34bf191541d635d02b6e7330d083475a56fdb03bc2095b8a568a3537e64472fcb944bfb1e9b1b9dc87075755447; JUTE_BBS_DATA=080438609523aa587120fdc1429f501793a91493312c1448f9ba416b15303e7cebe85430a698c719ac1a0bb30f13ed22501ef83c0c15f3d631cf8b79061a181d1b1454a49b7750887e18e90ed1e6f240; CMSSESSIONID=896FCEBDF678B51DBFFEC07E50953D16-n2; Hm_lvt_d1780dad16c917088dd01980f5a2cfa7=1605860503,1605860521,1606098267,1606187293; __utmc=129582553; route=d90bc6b4cfece12cbb65cc87d0d87858; DRUGSSESSIONID=0E86D2E367F456506C75E28B30BBA69F-n2; __asc=e57448a21760362702dd15cc875; __utma=129582553.290290514.1604373607.1606352101.1606374553.40; __utmt=1; __utmb=129582553.2.10.1606374553; Hm_lpvt_d1780dad16c917088dd01980f5a2cfa7=1606374556',
        'Host': 'drugs.dxy.cn',
        'Origin': 'http://drugs.dxy.cn',
        'Referer': 'http://drugs.dxy.cn/drug/89790/detail.htm',
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.67 Safari/537.36 Edg/87.0.664.47'
}

# URL-encode the fields and send them as the POST body
params = urllib.parse.urlencode(params).encode("utf-8")
base_url = 'http://drugs.dxy.cn/dwr/call/plaincall/DrugUtils.showDetail.dwr'
req = urllib.request.Request(base_url, data=params, headers=headers)
res = urllib.request.urlopen(req)

# Decode the response body and print it
html = res.read().decode(encoding="utf-8").strip()
print(html)
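
Since urlopen did not raise here, the failure only shows up in the response body, so a script has to inspect the text to notice it. The following is a minimal sketch, assuming the error always arrives in the dwr.engine._remoteHandleException(...) form quoted in the question; the helper name dwr_error_message is hypothetical and not part of any library.

```python
import re

# Matches the exception callback that the server wrote into the response body.
# The pattern is modelled only on the single error line quoted in the question.
_DWR_ERROR = re.compile(
    r"dwr\.engine\._remoteHandleException\("
    r"'(?P<batch>\d+)','(?P<call>\d+)',"
    r"\{.*?message:\"(?P<message>[^\"]*)\".*?\}\)"
)


def dwr_error_message(body):
    """Return the server-side error message, or None if no exception callback is found."""
    match = _DWR_ERROR.search(body)
    return match.group("message") if match else None


sample = (
    "dwr.engine._remoteHandleException('2','0',"
    '{javaClassName:"java.lang.Throwable",message:"Error"});'
)
print(dwr_error_message(sample))  # prints: Error
```

Calling dwr_error_message(html) before parsing the response lets the script stop early instead of treating the error object as the scraped content.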

2 answers

  • vigiles 2020-11-26 18:39

    Also, setting only 'User-Agent' in the headers is enough; setting the others actually causes problems.
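
The advice in this answer can be sketched as follows: keep only User-Agent and let urllib compute the remaining headers. The sketch also goes one step beyond the answer and sends the body as newline-separated key=value lines with typed parameters such as c0-param0=number:89790. That layout is an assumption about how this DWR endpoint expects plain calls to be written; the thread does not confirm it. build_dwr_request is a hypothetical helper, the User-Agent string is a placeholder, and the scriptSessionId is the one from the question and may no longer be valid.

```python
import urllib.request

URL = 'http://drugs.dxy.cn/dwr/call/plaincall/DrugUtils.showDetail.dwr'


def build_dwr_request(drug_id, section, script_session_id):
    """Build (but do not send) a POST request for DrugUtils.showDetail."""
    fields = [
        ('callCount', '1'),
        ('page', f'/drug/{drug_id}/detail.htm'),
        ('httpSessionId', ''),
        ('scriptSessionId', script_session_id),
        ('c0-scriptName', 'DrugUtils'),
        ('c0-methodName', 'showDetail'),
        ('c0-id', '0'),
        # Assumption: the type tag belongs in the value, not in the key.
        ('c0-param0', f'number:{drug_id}'),
        ('c0-param1', f'number:{section}'),
        ('batchId', '2'),
    ]
    # Assumption: one key=value pair per line, with no URL-encoding applied.
    body = '\n'.join(f'{key}={value}' for key, value in fields).encode('utf-8')
    # Only User-Agent is set, as the answer suggests; urllib fills in
    # Content-Length and Host itself when the request is sent.
    headers = {'User-Agent': 'Mozilla/5.0'}
    return urllib.request.Request(URL, data=body, headers=headers)


req = build_dwr_request(89790, 2, 'D8B63E5C6C13BEC549EB1F56C5D7D79B627')

# Uncomment to actually send the request:
# with urllib.request.urlopen(req) as res:
#     print(res.read().decode('utf-8').strip())
```

Dropping the hard-coded Content-Length matters in particular, because the value 216 copied from the browser will not match a body built by the script unless the two happen to be the same size.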

    Selected by the asker as the best answer.