My images are passed in as URLs. Here is a sample of the dataset:
[ {
"id" : "2500010_1",
"conversations" : [ {
"from" : "user",
"value" : "FabricDesign Yes: <|vision_start|>https://deti.cn-sh2.ufileos.com/PC%2F9db094a2-2e8c-4fa1-b931-50f9e25c9c98.png?UCloudPublicKey=TOKEN_696b1902-38dd-411f-b9f6-18c454ec200c&Signature=x4SPImsAbXI3EctpIQwYKu2ZfnM%3D&Expires=1746687410<|vision_end|>"
}, {
"from" : "assistant",
"value" : "女生-上衣,型号:2500010,颜色:深灰,质量等级:合格品,服务类型:打版+采购+生产,生产类型:包工包料"
} ]
}, {
"id" : "2500010_2",
"conversations" : [ {
"from" : "user",
"value" : "FabricDesign Yes: <|vision_start|>https://deti.cn-sh2.ufileos.com/PC%2F5c21c566-d60d-4679-9d0b-9423d00f585f.png?UCloudPublicKey=TOKEN_696b1902-38dd-411f-b9f6-18c454ec200c&Signature=WxAMVLeCtL0UQKdQwRWU18V6wnI%3D&Expires=1746675952<|vision_end|>"
}, {
"from" : "assistant",
"value" : "主料A:50单面 | 编号:50单面 | 成分:90%棉 | 颜色:精白 | 幅宽:20*20 | 单位:米 | 克重:20 | 供应商:(13632374222) | 含税价:17.80 | 特殊工艺:"
} ]
Then training failed with this error:
D:\Anaconda\envs\wt\python.exe D:\xm\wt\train.py
swanlab: Tracking run with swanlab version 0.5.7
swanlab: Run data will be saved locally in D:\xm\wt\swanlog\run-20250507_172653-2126b24a
swanlab: 👋 Hi xun, welcome to swanlab!
swanlab: Syncing run Qwen/Qwen2-VL-2B-Instruct to the cloud
swanlab: 🏠 View project at https://swanlab.cn/@xun/qwen-finetune
swanlab: 🚀 View run at https://swanlab.cn/@xun/qwen-finetune/runs/ybsxv8axwjtlelg87rmm0
2025-05-07 17:26:54,534 - modelscope - WARNING - Using branch: master as version is unstable, use with caution
The argument `trust_remote_code` is to be used with Auto classes. It has no effect here and is ignored.
`Qwen2VLRotaryEmbedding` can now be fully parameterized by passing the model config through the `config` argument. All other arguments will be removed in v4.46
Loading checkpoint shards: 100%|██████████| 2/2 [00:00<00:00, 9.83it/s]
Generating train split: 819 examples [00:00, 34821.44 examples/s]
Map: 7%|▋ | 57/819 [00:09<02:11, 5.81 examples/s]
swanlab: Error happened while training
swanlab: 🏠 View project at https://swanlab.cn/@xun/qwen-finetune
swanlab: 🚀 View run at https://swanlab.cn/@xun/qwen-finetune/runs/ybsxv8axwjtlelg87rmm0
File "D:\xm\wt\train.py", line 146, in <module>
train_dataset = train_ds.map(process_func)
File "D:\Anaconda\envs\wt\lib\site-packages\datasets\arrow_dataset.py", line 557, in wrapper
out: Union["Dataset", "DatasetDict"] = func(self, *args, **kwargs)
File "D:\Anaconda\envs\wt\lib\site-packages\datasets\arrow_dataset.py", line 3079, in map
for rank, done, content in Dataset._map_single(**dataset_kwargs):
File "D:\Anaconda\envs\wt\lib\site-packages\datasets\arrow_dataset.py", line 3501, in _map_single
for i, example in iter_outputs(shard_iterable):
File "D:\Anaconda\envs\wt\lib\site-packages\datasets\arrow_dataset.py", line 3475, in iter_outputs
yield i, apply_function(example, i, offset=offset)
File "D:\Anaconda\envs\wt\lib\site-packages\datasets\arrow_dataset.py", line 3398, in apply_function
processed_inputs = function(*fn_args, *additional_args, **fn_kwargs)
File "D:\xm\wt\train.py", line 56, in process_func
image_inputs, video_inputs = process_vision_info(messages) # 获取数据数据(预处理过)
File "D:\Anaconda\envs\wt\lib\site-packages\qwen_vl_utils\vision_process.py", line 330, in process_vision_info
image_inputs.append(fetch_image(vision_info))
File "D:\Anaconda\envs\wt\lib\site-packages\qwen_vl_utils\vision_process.py", line 91, in fetch_image
image_obj = Image.open(requests.get(image, stream=True).raw)
File "D:\Anaconda\envs\wt\lib\site-packages\PIL\Image.py", line 3572, in open
raise UnidentifiedImageError(msg)
cannot identify image file <_io.BytesIO object at 0x0000029BEBD236A0>
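The last frames show `Image.open` failing on the bytes that `requests.get` returned. That usually means the response body for at least one URL was not image data (for example an error document returned by the storage service), not that the JSON layout is wrong. The `Map` progress bar stopped at 57/819, so the offending record is probably around that index. A minimal sketch for checking each URL before training, using only the standard library; `looks_like_image` and `check_image_url` are hypothetical helper names, and the magic-byte test is only a rough stand-in for PIL's own detection:

```python
import urllib.error
import urllib.request

# Leading bytes of a few common image formats.
# Assumption: the dataset only contains PNG, JPEG, or GIF files.
IMAGE_SIGNATURES = (
    b"\x89PNG\r\n\x1a\n",  # PNG
    b"\xff\xd8\xff",       # JPEG
    b"GIF87a",             # GIF (1987 variant)
    b"GIF89a",             # GIF (1989 variant)
)


def looks_like_image(data: bytes) -> bool:
    """Return True if `data` starts with a known image signature."""
    return data.startswith(IMAGE_SIGNATURES)


def check_image_url(url: str, timeout: float = 10.0) -> str:
    """Return "ok", or a short reason why the URL cannot be used as an image."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            head = resp.read(16)  # the signature sits in the first few bytes
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status (e.g. 403 for a bad signature).
        return f"HTTP {exc.code}"
    except (urllib.error.URLError, OSError, ValueError) as exc:
        # DNS failure, timeout, malformed URL, and similar problems.
        return f"request failed: {exc}"
    return "ok" if looks_like_image(head) else "body is not an image"
```

Running `check_image_url` over the URL of every record and printing the `id` of each one that does not return `"ok"` should point at the records that break `process_vision_info`.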
Is there a problem with my dataset format? What does the standard format look like?
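On the format question: `train.py` is not shown, so the following is only a sketch under assumptions. As far as I know, `process_vision_info` from `qwen_vl_utils` reads a list of chat messages whose `content` holds entries such as `{"type": "image", "image": <path or URL>}`, so the `<|vision_start|>...<|vision_end|>` string from the JSON has to be split into that shape inside `process_func`. The helper name `to_messages` and the regular expression are hypothetical, not part of any library:

```python
import re

# Captures whatever sits between the two vision markers in the "value" field.
VISION_SPAN = re.compile(r"<\|vision_start\|>(.*?)<\|vision_end\|>", re.DOTALL)


def to_messages(user_value: str) -> list:
    """Split a 'prompt <|vision_start|>URL<|vision_end|>' string into chat messages."""
    match = VISION_SPAN.search(user_value)
    if match is None:
        raise ValueError("no <|vision_start|>...<|vision_end|> span found")
    image_url = match.group(1).strip()
    # Everything outside the markers is treated as the text prompt.
    prompt = VISION_SPAN.sub("", user_value).strip()
    return [
        {
            "role": "user",
            "content": [
                {"type": "image", "image": image_url},
                {"type": "text", "text": prompt},
            ],
        }
    ]
```

For the first record above this yields one user message with the signed URL as the image entry and `FabricDesign Yes:` as the text entry. The traceback shows `fetch_image` already attempting the download, so the URL is evidently reaching it; the layout may therefore be fine, and the failing record's URL is the more likely culprit.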