Problem encountered while deploying the WeClone project from GitHub

Has anyone else run into this? I'd like to know how to fix it.

The full log is:

(.venv) root@autodl-container-e5aa47b621-4bee74ab:~/autodl-tmp/WeClone# weclone-cli make-dataset
INFO 05-16 17:15:48 [__init__.py:239] Automatically detected platform cuda.
[WeClone] I | 17:15:50 | Loading configuration from: ./settings.jsonc
[WeClone] I | 17:15:50 | 聊天记录禁用词: ['例如 密码', '例如 姓名', '//.....']
[WeClone] I | 17:15:50 | 开始使用llm对数据打分
[INFO|configuration_utils.py:697] 2025-05-16 17:15:50,895 >> loading configuration file ./Qwen2.5-7B-Instruct/config.json
[INFO|configuration_utils.py:771] 2025-05-16 17:15:50,897 >> Model config Qwen2Config {
  "_name_or_path": "./Qwen2.5-7B-Instruct",
  "architectures": [
    "Qwen2ForCausalLM"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "hidden_act": "silu",
  "hidden_size": 3584,
  "initializer_range": 0.02,
  "intermediate_size": 18944,
  "max_position_embeddings": 32768,
  "max_window_layers": 28,
  "model_type": "qwen2",
  "num_attention_heads": 28,
  "num_hidden_layers": 28,
  "num_key_value_heads": 4,
  "rms_norm_eps": 1e-06,
  "rope_scaling": null,
  "rope_theta": 1000000.0,
  "sliding_window": 131072,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.49.0",
  "use_cache": true,
  "use_sliding_window": false,
  "vocab_size": 152064
}

[INFO|tokenization_utils_base.py:2048] 2025-05-16 17:15:50,935 >> loading file vocab.json
[INFO|tokenization_utils_base.py:2048] 2025-05-16 17:15:50,935 >> loading file merges.txt
[INFO|tokenization_utils_base.py:2048] 2025-05-16 17:15:50,935 >> loading file tokenizer.json
[INFO|tokenization_utils_base.py:2048] 2025-05-16 17:15:50,935 >> loading file added_tokens.json
[INFO|tokenization_utils_base.py:2048] 2025-05-16 17:15:50,935 >> loading file special_tokens_map.json
[INFO|tokenization_utils_base.py:2048] 2025-05-16 17:15:50,935 >> loading file tokenizer_config.json
[INFO|tokenization_utils_base.py:2048] 2025-05-16 17:15:50,935 >> loading file chat_template.jinja
[INFO|tokenization_utils_base.py:2313] 2025-05-16 17:15:51,266 >> Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
[INFO|configuration_utils.py:697] 2025-05-16 17:15:51,267 >> loading configuration file ./Qwen2.5-7B-Instruct/config.json
[INFO|configuration_utils.py:771] 2025-05-16 17:15:51,269 >> Model config Qwen2Config {
  "_name_or_path": "./Qwen2.5-7B-Instruct",
  "architectures": [
    "Qwen2ForCausalLM"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "hidden_act": "silu",
  "hidden_size": 3584,
  "initializer_range": 0.02,
  "intermediate_size": 18944,
  "max_position_embeddings": 32768,
  "max_window_layers": 28,
  "model_type": "qwen2",
  "num_attention_heads": 28,
  "num_hidden_layers": 28,
  "num_key_value_heads": 4,
  "rms_norm_eps": 1e-06,
  "rope_scaling": null,
  "rope_theta": 1000000.0,
  "sliding_window": 131072,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.49.0",
  "use_cache": true,
  "use_sliding_window": false,
  "vocab_size": 152064
}

[INFO|tokenization_utils_base.py:2048] 2025-05-16 17:15:51,270 >> loading file vocab.json
[INFO|tokenization_utils_base.py:2048] 2025-05-16 17:15:51,270 >> loading file merges.txt
[INFO|tokenization_utils_base.py:2048] 2025-05-16 17:15:51,270 >> loading file tokenizer.json
[INFO|tokenization_utils_base.py:2048] 2025-05-16 17:15:51,270 >> loading file added_tokens.json
[INFO|tokenization_utils_base.py:2048] 2025-05-16 17:15:51,270 >> loading file special_tokens_map.json
[INFO|tokenization_utils_base.py:2048] 2025-05-16 17:15:51,270 >> loading file tokenizer_config.json
[INFO|tokenization_utils_base.py:2048] 2025-05-16 17:15:51,270 >> loading file chat_template.jinja
[INFO|tokenization_utils_base.py:2313] 2025-05-16 17:15:51,584 >> Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
[INFO|2025-05-16 17:15:51] llamafactory.data.template:157 >> Add <|im_end|> to stop words.
[INFO|configuration_utils.py:697] 2025-05-16 17:15:51,618 >> loading configuration file ./Qwen2.5-7B-Instruct/config.json
[INFO|configuration_utils.py:697] 2025-05-16 17:15:51,618 >> loading configuration file ./Qwen2.5-7B-Instruct/config.json
[INFO|configuration_utils.py:771] 2025-05-16 17:15:51,619 >> Model config Qwen2Config {
  "_name_or_path": "./Qwen2.5-7B-Instruct",
  "architectures": [
    "Qwen2ForCausalLM"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "hidden_act": "silu",
  "hidden_size": 3584,
  "initializer_range": 0.02,
  "intermediate_size": 18944,
  "max_position_embeddings": 32768,
  "max_window_layers": 28,
  "model_type": "qwen2",
  "num_attention_heads": 28,
  "num_hidden_layers": 28,
  "num_key_value_heads": 4,
  "rms_norm_eps": 1e-06,
  "rope_scaling": null,
  "rope_theta": 1000000.0,
  "sliding_window": 131072,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.49.0",
  "use_cache": true,
  "use_sliding_window": false,
  "vocab_size": 152064
}

[INFO|image_processing_auto.py:301] 2025-05-16 17:15:51,621 >> Could not locate the image processor configuration file, will try to use the model config instead.
INFO 05-16 17:15:59 [config.py:585] This model supports multiple tasks: {'generate', 'embed', 'score', 'reward', 'classify'}. Defaulting to 'generate'.
INFO 05-16 17:15:59 [config.py:1697] Chunked prefill is enabled with max_num_batched_tokens=8192.
[INFO|tokenization_utils_base.py:2048] 2025-05-16 17:16:00,833 >> loading file vocab.json
[INFO|tokenization_utils_base.py:2048] 2025-05-16 17:16:00,833 >> loading file merges.txt
[INFO|tokenization_utils_base.py:2048] 2025-05-16 17:16:00,833 >> loading file tokenizer.json
[INFO|tokenization_utils_base.py:2048] 2025-05-16 17:16:00,833 >> loading file added_tokens.json
[INFO|tokenization_utils_base.py:2048] 2025-05-16 17:16:00,833 >> loading file special_tokens_map.json
[INFO|tokenization_utils_base.py:2048] 2025-05-16 17:16:00,833 >> loading file tokenizer_config.json
[INFO|tokenization_utils_base.py:2048] 2025-05-16 17:16:00,833 >> loading file chat_template.jinja
[INFO|tokenization_utils_base.py:2313] 2025-05-16 17:16:01,136 >> Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
[INFO|configuration_utils.py:1093] 2025-05-16 17:16:01,230 >> loading configuration file ./Qwen2.5-7B-Instruct/generation_config.json
[INFO|configuration_utils.py:1140] 2025-05-16 17:16:01,231 >> Generate config GenerationConfig {
  "bos_token_id": 151643,
  "do_sample": true,
  "eos_token_id": [
    151645,
    151643
  ],
  "pad_token_id": 151643,
  "repetition_penalty": 1.05,
  "temperature": 0.7,
  "top_k": 20,
  "top_p": 0.8
}

WARNING 05-16 17:16:01 [utils.py:2181] We must use the `spawn` multiprocessing start method. Overriding VLLM_WORKER_MULTIPROC_METHOD to 'spawn'. See https://docs.vllm.ai/en/latest/getting_started/troubleshooting.html#python-multiprocessing for more information. Reason: CUDA is initialized
INFO 05-16 17:16:05 [__init__.py:239] Automatically detected platform cuda.
INFO 05-16 17:16:07 [core.py:54] Initializing a V1 LLM engine (v0.8.2) with config: model='./Qwen2.5-7B-Instruct', speculative_config=None, tokenizer='./Qwen2.5-7B-Instruct', skip_tokenizer_init=False, tokenizer_mode=auto, revision=None, override_neuron_config=None, tokenizer_revision=None, trust_remote_code=True, dtype=torch.bfloat16, max_seq_len=3072, download_dir=None, load_format=auto, tensor_parallel_size=1, pipeline_parallel_size=1, disable_custom_all_reduce=False, quantization=None, enforce_eager=False, kv_cache_dtype=auto,  device_config=cuda, decoding_config=DecodingConfig(guided_decoding_backend='xgrammar', reasoning_backend=None), observability_config=ObservabilityConfig(show_hidden_metrics=False, otlp_traces_endpoint=None, collect_model_forward_time=False, collect_model_execute_time=False), seed=None, served_model_name=./Qwen2.5-7B-Instruct, num_scheduler_steps=1, multi_step_stream_outputs=True, enable_prefix_caching=True, chunked_prefill_enabled=True, use_async_output_proc=True, disable_mm_preprocessor_cache=False, mm_processor_kwargs=None, pooler_config=None, compilation_config={"level":3,"custom_ops":["none"],"splitting_ops":["vllm.unified_attention","vllm.unified_attention_with_output"],"use_inductor":true,"compile_sizes":[],"use_cudagraph":true,"cudagraph_num_of_warmups":1,"cudagraph_capture_sizes":[512,504,496,488,480,472,464,456,448,440,432,424,416,408,400,392,384,376,368,360,352,344,336,328,320,312,304,296,288,280,272,264,256,248,240,232,224,216,208,200,192,184,176,168,160,152,144,136,128,120,112,104,96,88,80,72,64,56,48,40,32,24,16,8,4,2,1],"max_capture_size":512}
WARNING 05-16 17:16:08 [utils.py:2321] Methods determine_num_available_blocks,device_config,get_cache_block_size_bytes,initialize_cache not implemented in <vllm.v1.worker.gpu_worker.Worker object at 0x7f54bb068c10>
INFO 05-16 17:16:09 [parallel_state.py:954] rank 0 in world size 1 is assigned as DP rank 0, PP rank 0, TP rank 0
INFO 05-16 17:16:09 [cuda.py:220] Using Flash Attention backend on V1 engine.
INFO 05-16 17:16:09 [gpu_model_runner.py:1174] Starting to load model ./Qwen2.5-7B-Instruct...
WARNING 05-16 17:16:09 [topk_topp_sampler.py:63] FlashInfer is not available. Falling back to the PyTorch-native implementation of top-p & top-k sampling. For the best performance, please install FlashInfer.
Loading safetensors checkpoint shards:   0% Completed | 0/4 [00:00<?, ?it/s]
Loading safetensors checkpoint shards:   0% Completed | 0/4 [00:00<?, ?it/s]

ERROR 05-16 17:16:09 [core.py:343] EngineCore hit an exception: Traceback (most recent call last):
ERROR 05-16 17:16:09 [core.py:343]   File "/root/autodl-tmp/WeClone/.venv/lib/python3.10/site-packages/vllm/v1/engine/core.py", line 335, in run_engine_core
ERROR 05-16 17:16:09 [core.py:343]     engine_core = EngineCoreProc(*args, **kwargs)
ERROR 05-16 17:16:09 [core.py:343]   File "/root/autodl-tmp/WeClone/.venv/lib/python3.10/site-packages/vllm/v1/engine/core.py", line 290, in __init__
ERROR 05-16 17:16:09 [core.py:343]     super().__init__(vllm_config, executor_class, log_stats)
ERROR 05-16 17:16:09 [core.py:343]   File "/root/autodl-tmp/WeClone/.venv/lib/python3.10/site-packages/vllm/v1/engine/core.py", line 60, in __init__
ERROR 05-16 17:16:09 [core.py:343]     self.model_executor = executor_class(vllm_config)
ERROR 05-16 17:16:09 [core.py:343]   File "/root/autodl-tmp/WeClone/.venv/lib/python3.10/site-packages/vllm/executor/executor_base.py", line 52, in __init__
ERROR 05-16 17:16:09 [core.py:343]     self._init_executor()
ERROR 05-16 17:16:09 [core.py:343]   File "/root/autodl-tmp/WeClone/.venv/lib/python3.10/site-packages/vllm/executor/uniproc_executor.py", line 47, in _init_executor
ERROR 05-16 17:16:09 [core.py:343]     self.collective_rpc("load_model")
ERROR 05-16 17:16:09 [core.py:343]   File "/root/autodl-tmp/WeClone/.venv/lib/python3.10/site-packages/vllm/executor/uniproc_executor.py", line 56, in collective_rpc
ERROR 05-16 17:16:09 [core.py:343]     answer = run_method(self.driver_worker, method, args, kwargs)
ERROR 05-16 17:16:09 [core.py:343]   File "/root/autodl-tmp/WeClone/.venv/lib/python3.10/site-packages/vllm/utils.py", line 2255, in run_method
ERROR 05-16 17:16:09 [core.py:343]     return func(*args, **kwargs)
ERROR 05-16 17:16:09 [core.py:343]   File "/root/autodl-tmp/WeClone/.venv/lib/python3.10/site-packages/vllm/v1/worker/gpu_worker.py", line 136, in load_model
ERROR 05-16 17:16:09 [core.py:343]     self.model_runner.load_model()
ERROR 05-16 17:16:09 [core.py:343]   File "/root/autodl-tmp/WeClone/.venv/lib/python3.10/site-packages/vllm/v1/worker/gpu_model_runner.py", line 1177, in load_model
ERROR 05-16 17:16:09 [core.py:343]     self.model = get_model(vllm_config=self.vllm_config)
ERROR 05-16 17:16:09 [core.py:343]   File "/root/autodl-tmp/WeClone/.venv/lib/python3.10/site-packages/vllm/model_executor/model_loader/__init__.py", line 14, in get_model
ERROR 05-16 17:16:09 [core.py:343]     return loader.load_model(vllm_config=vllm_config)
ERROR 05-16 17:16:09 [core.py:343]   File "/root/autodl-tmp/WeClone/.venv/lib/python3.10/site-packages/vllm/model_executor/model_loader/loader.py", line 444, in load_model
ERROR 05-16 17:16:09 [core.py:343]     loaded_weights = model.load_weights(
ERROR 05-16 17:16:09 [core.py:343]   File "/root/autodl-tmp/WeClone/.venv/lib/python3.10/site-packages/vllm/model_executor/models/qwen2.py", line 490, in load_weights
ERROR 05-16 17:16:09 [core.py:343]     return loader.load_weights(weights)
ERROR 05-16 17:16:09 [core.py:343]   File "/root/autodl-tmp/WeClone/.venv/lib/python3.10/site-packages/vllm/model_executor/models/utils.py", line 235, in load_weights
ERROR 05-16 17:16:09 [core.py:343]     autoloaded_weights = set(self._load_module("", self.module, weights))
ERROR 05-16 17:16:09 [core.py:343]   File "/root/autodl-tmp/WeClone/.venv/lib/python3.10/site-packages/vllm/model_executor/models/utils.py", line 187, in _load_module
ERROR 05-16 17:16:09 [core.py:343]     for child_prefix, child_weights in self._groupby_prefix(weights):
ERROR 05-16 17:16:09 [core.py:343]   File "/root/autodl-tmp/WeClone/.venv/lib/python3.10/site-packages/vllm/model_executor/models/utils.py", line 101, in _groupby_prefix
ERROR 05-16 17:16:09 [core.py:343]     for prefix, group in itertools.groupby(weights_by_parts,
ERROR 05-16 17:16:09 [core.py:343]   File "/root/autodl-tmp/WeClone/.venv/lib/python3.10/site-packages/vllm/model_executor/models/utils.py", line 98, in <genexpr>
ERROR 05-16 17:16:09 [core.py:343]     weights_by_parts = ((weight_name.split(".", 1), weight_data)
ERROR 05-16 17:16:09 [core.py:343]   File "/root/autodl-tmp/WeClone/.venv/lib/python3.10/site-packages/vllm/model_executor/model_loader/loader.py", line 420, in _get_all_weights
ERROR 05-16 17:16:09 [core.py:343]     yield from self._get_weights_iterator(primary_weights)
ERROR 05-16 17:16:09 [core.py:343]   File "/root/autodl-tmp/WeClone/.venv/lib/python3.10/site-packages/vllm/model_executor/model_loader/loader.py", line 403, in <genexpr>
ERROR 05-16 17:16:09 [core.py:343]     return ((source.prefix + name, tensor)
ERROR 05-16 17:16:09 [core.py:343]   File "/root/autodl-tmp/WeClone/.venv/lib/python3.10/site-packages/vllm/model_executor/model_loader/weight_utils.py", line 441, in safetensors_weights_iterator
ERROR 05-16 17:16:09 [core.py:343]     with safe_open(st_file, framework="pt") as f:
ERROR 05-16 17:16:09 [core.py:343] safetensors_rust.SafetensorError: Error while deserializing header: HeaderTooSmall
ERROR 05-16 17:16:09 [core.py:343] 
CRITICAL 05-16 17:16:09 [core_client.py:269] Got fatal signal from worker processes, shutting down. See stack trace above for root cause issue.
Killed


3 answers

檀越@新空间 2025-05-16 17:22
Good afternoon 🌅🌅🌅
This answer draws on Tongyi Qianwen (通义千问).

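Based on the log you provided, you ran into a problem while deploying the weclone project, but the log does not clearly show a specific error or exception. To help you narrow it down, here are some possible causes and corresponding fixes: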

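Analysis of possible causes

Dependencies not installed correctly

Some dependency libraries may not be installed correctly, or their versions may not match.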
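One way to check for a version mismatch is to print what is actually installed in the active virtual environment. The sketch below uses only the standard library. The helper name `installed_versions` and the package list are illustrative assumptions, not part of WeClone; the log itself reports vLLM v0.8.2 and transformers 4.49.0, so those are natural values to compare against.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Return {distribution name: installed version, or None if not installed}.

    Hypothetical helper for illustration; it is not part of WeClone.
    """
    result = {}
    for name in packages:
        try:
            result[name] = version(name)
        except PackageNotFoundError:
            # The distribution is missing from the active environment.
            result[name] = None
    return result


if __name__ == "__main__":
    # Assumed list of packages worth checking; adjust to the project's requirements.
    for name, ver in installed_versions(
        ["vllm", "transformers", "safetensors", "torch"]
    ).items():
        print(f"{name}: {ver if ver is not None else 'NOT INSTALLED'}")
```

Run it inside the same `.venv` that `weclone-cli` uses so the reported versions are the ones the command actually loads.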

Configuration file problems

settings.jsonc or another configuration file may contain formatting errors or be missing required fields.

Model path problems

The model path may point to the wrong directory, or the files may not exist.
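The traceback in the question ends with `safetensors_rust.SafetensorError: Error while deserializing header: HeaderTooSmall`, raised from `safe_open` while the first checkpoint shard was being loaded. A safetensors file begins with an 8-byte little-endian integer giving the length of a JSON header that follows it, so an empty or truncated shard (for example, from an interrupted download) is one plausible cause; that diagnosis is an assumption, not something the log confirms. A minimal sketch for checking each shard, using only the standard library (the helper names are hypothetical):

```python
import struct
from pathlib import Path


def check_safetensors_header(path):
    """Return None if the header looks plausible, else a short problem description."""
    size = Path(path).stat().st_size
    if size < 8:
        # Too short to contain the 8-byte header-length prefix.
        return f"file is only {size} bytes; too small to hold the 8-byte header length"
    with open(path, "rb") as f:
        # First 8 bytes: unsigned 64-bit little-endian length of the JSON header.
        (header_len,) = struct.unpack("<Q", f.read(8))
    if header_len > size - 8:
        return f"header claims {header_len} bytes but only {size - 8} bytes follow"
    return None


def check_model_dir(model_dir):
    """Map each *.safetensors file name in model_dir to its problem (or None)."""
    return {
        p.name: check_safetensors_header(p)
        for p in sorted(Path(model_dir).glob("*.safetensors"))
    }
```

Calling `check_model_dir("./Qwen2.5-7B-Instruct")` (the model directory named in the log) and printing the result would show which shards, if any, are too small. A shard that fails this check would most likely need to be downloaded again; a shard that passes may still be damaged further into the file, since only the header prefix is inspected.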

Special-character handling problems

The log mentions `Add

Question events

Question created on May 16
