weixin_45663087 2022-02-06 11:02 Acceptance rate: 50%

Running BERT's run_classifier file in Spyder fails with an error mentioning "\udcd5"

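Symptoms and background

I am new to NLP. I am using Spyder to run the BERT model on the MRPC dataset from the GLUE project, but it fails with UnicodeEncodeError: 'utf-8' codec can't encode character '\udcd5' in position 196: surrogates not allowed

The model input arguments are as follows: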

--task_name="MRPC" 
--do_train="true" 
--do_eval="true" 
--data_dir="…//GLUE//glue_data//MRPC//" 
--vocab_file="…//GLUE//BERT_BASE_DIR//uncased_L-12_H-768_A-12//vocab.txt" 
--bert_config_file="…//GLUE//BERT_BASE_DIR//uncased_L-12_H-768_A-12//bert_config.json" 
--init_checkpoint="…//GLUE//BERT_BASE_DIR//uncased_L-12_H-768_A-12//bert_model.ckpt" 
--max_seq_length="128" 
--train_batch_size="1" 
--learning_rate="2e-5" 
--num_train_epochs="1.0" 
--output_dir="…//GLUE//output"

Relevant code (please do not paste screenshots)

run_classifier.py is run directly; the code can be downloaded from Google's bert project on GitHub.

Output and error message
ERROR:tornado.general:Uncaught exception in ZMQStream callback
Traceback (most recent call last):
  File "C:\Users\40701\.conda\envs\test\lib\site-packages\zmq\eventloop\zmqstream.py", line 431, in _run_callback
    callback(*args, **kwargs)
  File "C:\Users\40701\.conda\envs\test\lib\site-packages\ipykernel\iostream.py", line 126, in _handle_event
    event_f()
  File "C:\Users\40701\.conda\envs\test\lib\site-packages\ipykernel\iostream.py", line 498, in _flush
    parent=self.parent_header, ident=self.topic)
  File "C:\Users\40701\.conda\envs\test\lib\site-packages\jupyter_client\session.py", line 742, in send
    to_send = self.serialize(msg, ident)
  File "C:\Users\40701\.conda\envs\test\lib\site-packages\jupyter_client\session.py", line 630, in serialize
    content = self.pack(content)
  File "C:\Users\40701\.conda\envs\test\lib\site-packages\jupyter_client\session.py", line 83, in <lambda>
    ensure_ascii=False, allow_nan=False,
  File "C:\Users\40701\.conda\envs\test\lib\site-packages\zmq\utils\jsonapi.py", line 25, in dumps
    return json.dumps(o, **kwargs).encode("utf8")
UnicodeEncodeError: 'utf-8' codec can't encode character '\udcd5' in position 196: surrogates not allowed
ERROR:tornado.general:Uncaught exception in zmqstream callback
Traceback (most recent call last):
  File "C:\Users\40701\.conda\envs\test\lib\site-packages\zmq\eventloop\zmqstream.py", line 448, in _handle_events
    self._handle_recv()
  File "C:\Users\40701\.conda\envs\test\lib\site-packages\zmq\eventloop\zmqstream.py", line 477, in _handle_recv
    self._run_callback(callback, msg)
  File "C:\Users\40701\.conda\envs\test\lib\site-packages\zmq\eventloop\zmqstream.py", line 431, in _run_callback
    callback(*args, **kwargs)
  File "C:\Users\40701\.conda\envs\test\lib\site-packages\ipykernel\iostream.py", line 126, in _handle_event
    event_f()
  File "C:\Users\40701\.conda\envs\test\lib\site-packages\ipykernel\iostream.py", line 498, in _flush
    parent=self.parent_header, ident=self.topic)
  File "C:\Users\40701\.conda\envs\test\lib\site-packages\jupyter_client\session.py", line 742, in send
    to_send = self.serialize(msg, ident)
  File "C:\Users\40701\.conda\envs\test\lib\site-packages\jupyter_client\session.py", line 630, in serialize
    content = self.pack(content)
  File "C:\Users\40701\.conda\envs\test\lib\site-packages\jupyter_client\session.py", line 83, in <lambda>
    ensure_ascii=False, allow_nan=False,
  File "C:\Users\40701\.conda\envs\test\lib\site-packages\zmq\utils\jsonapi.py", line 25, in dumps
    return json.dumps(o, **kwargs).encode("utf8")
UnicodeEncodeError: 'utf-8' codec can't encode character '\udcd5' in position 196: surrogates not allowed
Exception in callback BaseAsyncIOLoop._handle_events(2244, 1)
handle: <Handle BaseAsyncIOLoop._handle_events(2244, 1)>
Traceback (most recent call last):
  File "C:\Users\40701\.conda\envs\test\lib\asyncio\events.py", line 88, in _run
    self._context.run(self._callback, *self._args)
  File "C:\Users\40701\.conda\envs\test\lib\site-packages\tornado\platform\asyncio.py", line 189, in _handle_events
    handler_func(fileobj, events)
  File "C:\Users\40701\.conda\envs\test\lib\site-packages\zmq\eventloop\zmqstream.py", line 448, in _handle_events
    self._handle_recv()
  File "C:\Users\40701\.conda\envs\test\lib\site-packages\zmq\eventloop\zmqstream.py", line 477, in _handle_recv
    self._run_callback(callback, msg)
  File "C:\Users\40701\.conda\envs\test\lib\site-packages\zmq\eventloop\zmqstream.py", line 431, in _run_callback
    callback(*args, **kwargs)
  File "C:\Users\40701\.conda\envs\test\lib\site-packages\ipykernel\iostream.py", line 126, in _handle_event
    event_f()
  File "C:\Users\40701\.conda\envs\test\lib\site-packages\ipykernel\iostream.py", line 498, in _flush
    parent=self.parent_header, ident=self.topic)
  File "C:\Users\40701\.conda\envs\test\lib\site-packages\jupyter_client\session.py", line 742, in send
    to_send = self.serialize(msg, ident)
  File "C:\Users\40701\.conda\envs\test\lib\site-packages\jupyter_client\session.py", line 630, in serialize
    content = self.pack(content)
  File "C:\Users\40701\.conda\envs\test\lib\site-packages\jupyter_client\session.py", line 83, in <lambda>
    ensure_ascii=False, allow_nan=False,
  File "C:\Users\40701\.conda\envs\test\lib\site-packages\zmq\utils\jsonapi.py", line 25, in dumps
    return json.dumps(o, **kwargs).encode("utf8")
UnicodeEncodeError: 'utf-8' codec can't encode character '\udcd5' in position 196: surrogates not allowed

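My approach and what I have tried

I asked someone, who said that one of the files might contain Chinese characters, but with my limited coding ability I cannot work out which file it is.
I then opened the files in the MRPC folder with Notepad and went to line 196, but found no Chinese characters there.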
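
A note on reading this error, offered as a sketch rather than a confirmed diagnosis: the "position 196" in the message is an offset into the string being encoded (in the traceback, the console output that gets serialized to JSON), not line 196 of a data file. A lone surrogate such as '\udcd5' is what Python produces when the byte 0xD5 is decoded as UTF-8 with the surrogateescape error handler. The example below reproduces the same kind of exception with hypothetical bytes. The byte string, the choice of the GBK code page, and the helper name try_strict_encode are assumptions for illustration and do not come from the traceback.

```python
# Hypothetical example data: bytes that are valid GBK but not valid UTF-8.
raw = b"path: \xd5\xd2\xb2\xbb"

# With errors="surrogateescape", each undecodable byte 0xNN becomes the
# lone surrogate U+DCNN instead of raising an error at decode time.
text = raw.decode("utf-8", errors="surrogateescape")


def try_strict_encode(s):
    """Return the UnicodeEncodeError raised by a strict UTF-8 encode, or None."""
    try:
        s.encode("utf-8")
    except UnicodeEncodeError as exc:
        return exc
    return None


err = try_strict_encode(text)

# err.start is an index into the string, not a line number in any file.
print("offending character:", repr(text[err.start]))
print("position in string:", err.start)
print("reason:", err.reason)

# Reversing the escape gives back the original bytes, which can then be
# decoded with the encoding they were actually written in (assumed GBK here).
recovered = text.encode("utf-8", errors="surrogateescape").decode("gbk")
print("recovered text:", ascii(recovered))
```

If text recovered this way turns out to be readable in the local code page, that would suggest some output (for example a path or a system message) was written in that encoding rather than in UTF-8.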

What I want to achieve

I hope to get run_classifier running so that it does not hold up my further study.


1 answer

  • CSDN专家-HGJ 2022-02-06 16:56

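    Judging from posts about similar problems online, this may be a file path issue. Check the file paths and try using single slashes as the path separator.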
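
One way to act on this suggestion, as a minimal sketch: scan each path argument for non-ASCII characters and collapse doubled separators into single forward slashes. The helper names non_ascii_chars and normalize_separators and the sample path are hypothetical and are not part of run_classifier.py. The regular expression also collapses a leading double slash, so it is not suitable for UNC paths.

```python
import re


def non_ascii_chars(path_text):
    """Return (index, character) pairs for every non-ASCII character."""
    return [(i, ch) for i, ch in enumerate(path_text) if ord(ch) > 127]


def normalize_separators(path_text):
    """Replace each run of slashes or backslashes with one forward slash."""
    return re.sub(r"[\\/]+", "/", path_text)


# Hypothetical path, used only to show the two helpers together.
sample = "C://demo//GLUE//output"
print(normalize_separators(sample))
print(non_ascii_chars(sample))
```

An empty list from non_ascii_chars means the path is plain ASCII, in which case the path text itself is unlikely to be the source of the undecodable byte.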


