cctwsnbb 2022-10-11 14:28 Acceptance rate: 0%

Problem encountered when scraping data with JS reverse engineering

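While trying to scrape this data, I analyzed the page source and found the encryption function webInstace. I stepped into it, copied the entire webInstace source with Ctrl+A, and saved it as a .js file, but running the code produced the two errors shown further down.
When single-stepping in the debugger, the argument passed into webInstace is displayed as f(_0xa0c834), and the same result can be obtained at the end of the source. However, pasting this source into the developer tools console and running it fails with:

Uncaught SyntaxError: Invalid or unexpected token at <anonymous>:1:18

How can I fix this?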
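
A SyntaxError of this kind generally means the JavaScript parser hit a character it cannot tokenize. One possible cause, not confirmed for this case, is a stray character picked up while copying, such as a zero-width space or a curly quote. The following is a minimal diagnostic sketch under that assumption; the helper name `find_suspicious_chars` and the character list are hypothetical and chosen only for illustration.

```python
# Characters that are not valid JavaScript tokens on their own and that
# sometimes sneak in when code is copied from a web page or rich-text editor.
# (Illustrative list, not exhaustive.)
SUSPICIOUS = {
    "\u200b": "zero-width space",
    "\u201c": "left curly double quote",
    "\u201d": "right curly double quote",
    "\u2018": "left curly single quote",
    "\u2019": "right curly single quote",
}

def find_suspicious_chars(source):
    """Return (line, column, description) tuples, 1-based, for each hit."""
    hits = []
    # Split on "\n" only, so line numbers roughly match what the console reports.
    for line_no, line in enumerate(source.split("\n"), start=1):
        for col_no, ch in enumerate(line, start=1):
            if ch in SUSPICIOUS:
                hits.append((line_no, col_no, SUSPICIOUS[ch]))
    return hits

# Small self-contained check: a zero-width space hidden between two statements.
sample = "var a = 1;\u200bvar b = 2;"
print(find_suspicious_chars(sample))
```

To apply it to the saved file, read the file as UTF-8 and pass its text to the helper, then compare any hit near line 1, column 18 with the position in the console message. A character at that position inside a string literal is harmless, so each hit still needs a manual look.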
Here is the source code:

import requests
import execjs
 
url = 'https://www.endata.com.cn/API/GetData.ashx'
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/106.0.0.0 Safari/537.36'
}
 
data = {
    'year':'2022',
    'MethodName':'BoxOffice_GetYearInfoData'
}
res = requests.post(url,data=data,headers=headers,timeout=5)
restext = res.text
with open('jsdata.js','r',encoding='utf-8') as f:
    js = f.read()
    # print(js)
resp = execjs.compile(js)
response = resp.call('webInstace.shell',restext)
print(response)

Error output:

Exception in thread Thread-1:
Traceback (most recent call last):
  File "C:\Users\24681\AppData\Local\Programs\Python\Python37\lib\threading.py", line 926, in _bootstrap_inner
    self.run()
  File "C:\Users\24681\AppData\Local\Programs\Python\Python37\lib\threading.py", line 870, in run
    self._target(*self._args, **self._kwargs)
  File "C:\Users\24681\AppData\Local\Programs\Python\Python37\lib\subprocess.py", line 1267, in _readerthread
    buffer.append(fh.read())
UnicodeDecodeError: 'gbk' codec can't decode byte 0xa5 in position 112: illegal multibyte sequence
 
Traceback (most recent call last):
  File "C:/Users/24681/Desktop/python file/caicai/untitled/爬虫/JS逆向/艺恩电影票房.py", line 19, in <module>
    response = resp.call('webInstace.shell',restext)
  File "C:\Users\24681\AppData\Local\Programs\Python\Python37\lib\site-packages\execjs\_abstract_runtime_context.py", line 37, in call
    return self._call(name, *args)
  File "C:\Users\24681\AppData\Local\Programs\Python\Python37\lib\site-packages\execjs\_external_runtime.py", line 92, in _call
    return self._eval("{identifier}.apply(this, {args})".format(identifier=identifier, args=args))
  File "C:\Users\24681\AppData\Local\Programs\Python\Python37\lib\site-packages\execjs\_external_runtime.py", line 78, in _eval
    return self.exec_(code)
  File "C:\Users\24681\AppData\Local\Programs\Python\Python37\lib\site-packages\execjs\_abstract_runtime_context.py", line 18, in exec_
    return self._exec_(source)
  File "C:\Users\24681\AppData\Local\Programs\Python\Python37\lib\site-packages\execjs\_external_runtime.py", line 87, in _exec_
    output = self._exec_with_pipe(source)
  File "C:\Users\24681\AppData\Local\Programs\Python\Python37\lib\site-packages\execjs\_external_runtime.py", line 103, in _exec_with_pipe
    stdoutdata, stderrdata = p.communicate(input=input)
  File "C:\Users\24681\AppData\Local\Programs\Python\Python37\lib\subprocess.py", line 964, in communicate
    stdout, stderr = self._communicate(input, endtime, timeout)
  File "C:\Users\24681\AppData\Local\Programs\Python\Python37\lib\subprocess.py", line 1317, in _communicate
    stdout = stdout[0]
IndexError: list index out of range
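
Reading the two tracebacks together: the first shows the `subprocess` reader thread failing to decode the child process output with the system default `gbk` codec, and the second `IndexError` looks like a follow-on effect, since the output buffer stays empty once that thread has died. A commonly used workaround is to force `subprocess.Popen` to decode as UTF-8. The sketch below demonstrates the idea with a stand-in child process instead of the real JavaScript runtime; it assumes the runtime emits UTF-8, and the names `CHILD_CODE`, `read_child_output`, and `utf8_popen` are hypothetical.

```python
import subprocess
import sys
from functools import partial

# Stand-in for the JavaScript runtime that execjs launches: a child process
# that writes the UTF-8 bytes of some non-ASCII text to stdout.
# (Assumption: the real runtime also emits UTF-8.)
CHILD_CODE = r"import sys; sys.stdout.buffer.write('\u7968\u623f'.encode('utf-8'))"

def read_child_output(popen):
    """Run the child in text mode and return its decoded stdout."""
    proc = popen(
        [sys.executable, "-c", CHILD_CODE],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    out, _ = proc.communicate()
    return out

# Popen with UTF-8 decoding forced, regardless of the system locale.
utf8_popen = partial(subprocess.Popen, encoding="utf-8")

result = read_child_output(utf8_popen)
print(result == "\u7968\u623f")  # True

# Applied to the script in the question, the same idea would go before the
# execjs import (assumption: execjs picks up Popen from subprocess at import):
#   import subprocess
#   from functools import partial
#   subprocess.Popen = partial(subprocess.Popen, encoding="utf-8")
#   import execjs
```

Replacing `subprocess.Popen` globally affects every subprocess started afterwards, and the patched name is no longer a class, so it is best kept to a small script like this one. If the runtime's output is not UTF-8, the decode error would simply move to a different codec.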
 

1 answer

  • CSDN-Ada助手 (official CSDN-AI account) 2022-10-11 15:05
