m0_73786595 2025-01-20 12:19 · Acceptance rate: 25%
81 views

ChatGLM4-9B inference error


Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Loading checkpoint shards: 100%|███████████████████████████████████████████████████████| 10/10 [00:11<00:00,  1.10s/it]
Some parameters are on the meta device because they were offloaded to the cpu.
Welcome to the GLM-4-9B CLI chat. Type your messages below.

You: hello
GLM-4:
Exception in thread Thread-2 (generate):
Traceback (most recent call last):
  File "D:\conda\envs\env_name\lib\threading.py", line 1016, in _bootstrap_inner
    self.run()
  File "D:\conda\envs\env_name\lib\threading.py", line 953, in run
    self._target(*self._args, **self._kwargs)
  File "D:\conda\envs\env_name\lib\site-packages\torch\utils\_contextlib.py", line 116, in decorate_context
    return func(*args, **kwargs)
  File "D:\conda\envs\env_name\lib\site-packages\transformers\generation\utils.py", line 1622, in generate
    result = self._sample(
  File "D:\conda\envs\env_name\lib\site-packages\transformers\generation\utils.py", line 2841, in _sample
    model_kwargs = self._update_model_kwargs_for_generation(
  File "C:\Users\Administrator\.cache\huggingface\modules\transformers_modules\glm-4-9b-chat\modeling_chatglm.py", line 929, in _update_model_kwargs_for_generation
    cache_name, cache = self._extract_past_from_model_output(outputs)
ValueError: too many values to unpack (expected 2)
Traceback (most recent call last):
  File "E:\ChatGLM3\basic_demo\trans_cli_demo.py", line 112, in <module>
    for new_token in streamer:
  File "D:\conda\envs\env_name\lib\site-packages\transformers\generation\streamers.py", line 223, in __next__
    value = self.text_queue.get(timeout=self.timeout)
  File "D:\conda\envs\env_name\lib\queue.py", line 179, in get
    raise Empty
_queue.Empty

I switched the transformers package to 4.40.0 (the officially required version) and also tried 4.40.2 and 4.39.3; none of them helped, and I don't know what to do now.


2 answers

  • master_chenchengg 2025-01-23 23:31

    This answer comes from Spark AI (星火AI); I hope it helps the asker!

    To fix the ChatGLM4-9B inference error, start from the specific error message and its context. Errors like this usually stem from model loading, the input data format, dependency versions, or configuration. Here are some common approaches:

    1. Check the model file path

    Make sure the model path you pass in is correct and that the files exist and are not corrupted.

    import os
    
    model_path = "path/to/your/model"
    if not os.path.exists(model_path):
        raise FileNotFoundError(f"Model file not found at {model_path}")
    

    2. Verify dependency versions

    Make sure all required dependencies are installed and their versions are compatible. For example, if you use the transformers library, check the installed version with:

    pip show transformers
    

    If the version is incompatible, upgrade or downgrade to a suitable one; for GLM-4-9B the officially required version is 4.40.0:

    pip install transformers==4.40.0
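
    If the environment still misbehaves after reinstalling, a small runtime guard makes a silent version mismatch visible. A minimal sketch, with the required version hard-coded from the GLM-4 repo's stated requirement (adjust it to whatever your model's README pins):

    import transformers

    # The question states the GLM-4 repo officially requires 4.40.0.
    REQUIRED_VERSION = "4.40.0"
    if transformers.__version__ != REQUIRED_VERSION:
        print(f"Warning: transformers {transformers.__version__} is installed, "
              f"but this model expects {REQUIRED_VERSION}")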
    

    3. Check the input data format

    Make sure the inputs match the format the model expects. For a plain text generation task, the input should be a list of strings; a chat model additionally expects its chat template to be applied, as sketched below.

    inputs = ["Hello, how are you?"]
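
    For a chat model like glm-4-9b-chat, the raw string usually also needs the model's chat template applied before tokenization. A minimal sketch, assuming the tokenizer ships a chat template (the official glm-4-9b-chat repo does); the path is illustrative:

    from transformers import AutoTokenizer

    # Illustrative path; point this at your local glm-4-9b-chat checkout.
    tokenizer = AutoTokenizer.from_pretrained("path/to/glm-4-9b-chat", trust_remote_code=True)

    messages = [{"role": "user", "content": "Hello, how are you?"}]
    # Render the conversation with the model's template, then tokenize the result.
    prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
    inputs = tokenizer(prompt, return_tensors="pt")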
    

    4. Enable debug logging

    Turn on verbose logging so you can trace what happens right before the failure.

    import logging
    
    logging.basicConfig(level=logging.DEBUG)
    logger = logging.getLogger(__name__)
    
    def log_debug_info(message):
        logger.debug(message)
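
    transformers has its own verbosity switches as well, which expose the library's internal messages during loading and generation. A short sketch using the library's documented logging utilities:

    from transformers.utils import logging as hf_logging

    # Emit transformers' internal debug messages with explicit formatting.
    hf_logging.set_verbosity_debug()
    hf_logging.enable_explicit_format()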
    

    5. Catch and handle exceptions

    Wrap the inference call in a try/except so the root cause is logged with a full traceback.

    try:
        outputs = model.generate(**inputs)  # your inference code here
    except Exception:
        logger.error("An error occurred during inference", exc_info=True)
        raise
    

    6. Consult the official documentation and community

    Check the ChatGLM4-9B official documentation and community channels for further help, for example the Hugging Face Transformers documentation or the project's discussion forums.

    Example code

    Below is a complete example that loads the model, runs inference, and includes error handling and logging:

    import os
    import logging
    from transformers import AutoTokenizer, AutoModelForCausalLM
    
    # Set up logging
    logging.basicConfig(level=logging.DEBUG)
    logger = logging.getLogger(__name__)
    
    def log_debug_info(message):
        logger.debug(message)
    
    # Check the model file path
    model_path = "path/to/your/model"
    if not os.path.exists(model_path):
        raise FileNotFoundError(f"Model file not found at {model_path}")
    
    # Load the model and tokenizer (GLM-4's custom modeling code requires trust_remote_code=True)
    try:
        tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
        model = AutoModelForCausalLM.from_pretrained(model_path, trust_remote_code=True)
        log_debug_info("Model and tokenizer loaded successfully")
    except Exception:
        logger.error("Failed to load model or tokenizer", exc_info=True)
        raise
    
    # Prepare the input data
    inputs = ["Hello, how are you?"]
    try:
        inputs = tokenizer(inputs, return_tensors="pt")
        log_debug_info("Input data prepared successfully")
    except Exception:
        logger.error("Failed to prepare input data", exc_info=True)
        raise
    
    # Run inference
    try:
        outputs = model.generate(**inputs)
        log_debug_info("Inference completed successfully")
    except Exception:
        logger.error("Inference failed", exc_info=True)
        raise
    
    # Decode and print the result
    try:
        result = tokenizer.decode(outputs[0], skip_special_tokens=True)
        print(result)
    except Exception:
        logger.error("Failed to decode output", exc_info=True)
        raise
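
    Since the traceback shows trans_cli_demo.py streaming tokens from generate() on a background thread, the streaming variant is worth sketching too. Note that the _queue.Empty at the end of the asker's log is only a symptom: the generate thread crashed first, so the streamer's queue timed out empty. A minimal sketch of the standard TextIteratorStreamer pattern, reusing the model, tokenizer, and inputs from above:

    from threading import Thread
    from transformers import TextIteratorStreamer

    # Stream decoded text as it is generated; drop the prompt and special tokens.
    streamer = TextIteratorStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)

    # generate() blocks, so run it on a worker thread and consume the streamer here.
    generation_kwargs = dict(**inputs, streamer=streamer, max_new_tokens=256)
    thread = Thread(target=model.generate, kwargs=generation_kwargs)
    thread.start()

    for new_text in streamer:
        print(new_text, end="", flush=True)
    thread.join()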
    

    Following these steps, you should be able to locate and fix the ChatGLM4-9B inference error. If the problem persists, consult the relevant documentation or ask the community for help.

