Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Loading checkpoint shards: 100%|███████████████████████████████████████████████████████| 10/10 [00:11<00:00, 1.10s/it]
Some parameters are on the meta device because they were offloaded to the cpu.
Welcome to the GLM-4-9B CLI chat. Type your messages below.
You: hello
GLM-4:
Exception in thread Thread-2 (generate):
Traceback (most recent call last):
  File "D:\conda\envs\env_name\lib\threading.py", line 1016, in _bootstrap_inner
    self.run()
  File "D:\conda\envs\env_name\lib\threading.py", line 953, in run
    self._target(*self._args, **self._kwargs)
  File "D:\conda\envs\env_name\lib\site-packages\torch\utils\_contextlib.py", line 116, in decorate_context
    return func(*args, **kwargs)
  File "D:\conda\envs\env_name\lib\site-packages\transformers\generation\utils.py", line 1622, in generate
    result = self._sample(
  File "D:\conda\envs\env_name\lib\site-packages\transformers\generation\utils.py", line 2841, in _sample
    model_kwargs = self._update_model_kwargs_for_generation(
  File "C:\Users\Administrator\.cache\huggingface\modules\transformers_modules\glm-4-9b-chat\modeling_chatglm.py", line 929, in _update_model_kwargs_for_generation
    cache_name, cache = self._extract_past_from_model_output(outputs)
ValueError: too many values to unpack (expected 2)
I have tried switching the transformers package to 4.40.0 (the officially required version), as well as 4.40.2 and 4.39.3, but none of them helped. I'm not sure what to try next.

Also, could there be something wrong with the value of outputs.past_key_values?
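To illustrate what the error itself means (a minimal self-contained sketch; the helper functions below are made up for demonstration and are not the real transformers API): the line in modeling_chatglm.py unpacks the helper's return value into two names, so if the installed transformers version returns only the cache object (a tuple with one entry per layer) instead of a (name, cache) pair, Python raises exactly this ValueError.

```python
# Hypothetical sketch (not actual GLM-4 or transformers code) of the
# "too many values to unpack (expected 2)" failure mode.

def extract_new_style(outputs):
    # Newer-style contract: return a (cache_name, cache) pair.
    return "past_key_values", outputs["past_key_values"]

def extract_old_style(outputs):
    # Older-style contract: return only the cache itself.
    return outputs["past_key_values"]

# A fake model output: one (key, value) pair per transformer layer.
outputs = {"past_key_values": tuple((f"k{i}", f"v{i}") for i in range(40))}

# This matches the two-name unpacking in modeling_chatglm.py, so it works:
cache_name, cache = extract_new_style(outputs)

# This returns a 40-element tuple, so unpacking it into two names fails:
try:
    cache_name, cache = extract_old_style(outputs)
except ValueError as e:
    print(e)  # too many values to unpack (expected 2)
```

So the error pattern suggests a contract mismatch between the remote modeling_chatglm.py code and the installed transformers version, rather than the cache values themselves being corrupted.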