jasonfzq 2024-04-20 15:29 Acceptance rate: 40%
Views: 225

JSON parse error: Column() changed from object to array in row 0

I get the following error while loading the dataset for a BigBird model. Any help would be appreciated.

WARNING:datasets.builder:Using custom data configuration default-433ca95c1a890d37
Downloading and preparing dataset json/default to /home/mat/.cache/huggingface/datasets/json/default-433ca95c1a890d37/0.0.0/a3e658c4731e59120d44081ac10bf85dc7e1388126b92338344ce9661907f253...
Downloading data files: 100%|█████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 2584.29it/s]
Extracting data files: 100%|███████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 448.35it/s]
0 tables [00:00, ? tables/s]ERROR:datasets.packaged_modules.json.json:Failed to read file '/home/mat/project/1429.json' with error <class 'pyarrow.lib.ArrowInvalid'>: JSON parse error: Column() changed from object to array in row 0
Traceback (most recent call last):
  File "/home/mat/.local/lib/python3.6/site-packages/datasets/packaged_modules/json/json.py", line 110, in _generate_tables
    io.BytesIO(batch), read_options=paj.ReadOptions(block_size=block_size)
  File "pyarrow/_json.pyx", line 246, in pyarrow._json.read_json
  File "pyarrow/error.pxi", line 143, in pyarrow.lib.pyarrow_internal_check_status
  File "pyarrow/error.pxi", line 99, in pyarrow.lib.check_status
pyarrow.lib.ArrowInvalid: JSON parse error: Column() changed from object to array in row 0

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "BigBird.py", line 29, in <module>
    dataset = load_dataset('json', data_files="./1429.json")
  File "/home/mat/.local/lib/python3.6/site-packages/datasets/load.py", line 1751, in load_dataset
    use_auth_token=use_auth_token,
  File "/home/mat/.local/lib/python3.6/site-packages/datasets/builder.py", line 705, in download_and_prepare
    dl_manager=dl_manager, verify_infos=verify_infos, **download_and_prepare_kwargs
  File "/home/mat/.local/lib/python3.6/site-packages/datasets/builder.py", line 793, in _download_and_prepare
    self._prepare_split(split_generator, **prepare_split_kwargs)
  File "/home/mat/.local/lib/python3.6/site-packages/datasets/builder.py", line 1275, in _prepare_split
    generator, unit=" tables", leave=False, disable=(not logging.is_progress_bar_enabled())
  File "/home/mat/.local/lib/python3.6/site-packages/tqdm/std.py", line 1195, in __iter__
    for obj in iterable:
  File "/home/mat/.local/lib/python3.6/site-packages/datasets/packaged_modules/json/json.py", line 135, in _generate_tables
    f"Not able to read records in the JSON file at {file}. "
AttributeError: 'list' object has no attribute 'keys'

I checked the dataset and it contains no null values, so I don't know why this fails.

[Image: screenshot of the dataset]

6 answers

  • 码农阿豪@新空间代码工作室 (top-rated Java contributor) 2024-04-20 15:31
    Let me help with this. This answer was written with reference to ChatGPT 3.5; if anything is still unclear, leave a comment.
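    Problem: Loading the dataset for the BigBird model fails with: Failed to read file '/home/mat/project/1429.json' with error <class 'pyarrow.lib.ArrowInvalid'>: JSON parse error: Column() changed from object to array in row 0.

    Cause: The error is raised while reading 1429.json. The parser found an array where it expected an object, so some of the data in the file is probably not in the format the JSON loader expects.

    Possible fixes:
    1. Check the dataset for illegal characters or other abnormal data and remove them.
    2. Use Python's json module to check whether the file is well-formed, and repair it if necessary.
    3. Try reading the file with the read_json function from pandas as a cross-check. With lines=True it expects one JSON object per line, and it will not silently skip malformed data.

    Example code:

    ```
    import json

    import pandas as pd

    # Report every line that is not valid JSON (assumes one JSON value per line)
    with open('1429.json', encoding='utf-8') as f:
        for lineno, line in enumerate(f, start=1):
            try:
                json.loads(line)
            except json.JSONDecodeError as e:
                print(f'line {lineno}: {e}')

    # Read the same file with pandas; lines=True expects JSON Lines input.
    # read_json has no errors= parameter, so that argument has been dropped.
    df = pd.read_json('1429.json', lines=True, orient='records', encoding='utf-8')
    print(df.head())
    ```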
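The second traceback in the question ends in `AttributeError: 'list' object has no attribute 'keys'`, which suggests that the whole file parsed as a JSON list at the top level rather than as one object per line. The sketch below is one way to follow up on step 2 under that assumption: it reports the top-level shape of the file and rewrites every object it finds as one line of JSON Lines. The helper names `describe_top_level` and `to_json_lines` and the output path `1429.jsonl` are hypothetical, and the real layout of 1429.json is not known from the post.

```python
import json


def describe_top_level(path):
    """Return a short description of the top-level shape of a JSON file."""
    with open(path, encoding="utf-8") as f:
        data = json.load(f)
    if isinstance(data, dict):
        return "object with keys: " + ", ".join(sorted(data))
    if isinstance(data, list):
        kinds = sorted({type(item).__name__ for item in data})
        return f"array of {len(data)} items with element types: {', '.join(kinds)}"
    return f"scalar of type {type(data).__name__}"


def to_json_lines(src, dst):
    """Write every object found in src to dst, one JSON object per line.

    Nested arrays are flattened; values that are neither objects nor
    arrays are skipped. Returns the number of records written.
    """
    with open(src, encoding="utf-8") as f:
        data = json.load(f)

    def iter_records(node):
        # Yield dicts as records and descend into (possibly nested) lists.
        if isinstance(node, dict):
            yield node
        elif isinstance(node, list):
            for child in node:
                yield from iter_records(child)

    count = 0
    with open(dst, "w", encoding="utf-8") as out:
        for record in iter_records(data):
            out.write(json.dumps(record, ensure_ascii=False) + "\n")
            count += 1
    return count


# Hypothetical usage (paths are assumptions):
# print(describe_top_level("1429.json"))
# print(to_json_lines("1429.json", "1429.jsonl"), "records written")
```

If the conversion produces sensible records, retrying with `load_dataset('json', data_files='./1429.jsonl')` may work; whether it does still depends on every record having the same fields with consistent types.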