Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Loading checkpoint shards: 100%|███████████████████████████████████████████████████████| 10/10 [00:11<00:00, 1.10s/it]
Some parameters are on the meta device because they were offloaded to the cpu.
Welcome to the GLM-4-9B CLI chat. Type your messages below.
You: hello
GLM-4:
Exception in thread Thread-2 (generate):
Traceback (most recent call last):
  File "D:\conda\envs\env_name\lib\threading.py", line 1016, in _bootstrap_inner
    self.run()
  File "D:\conda\envs\env_name\lib\threading.py", line 953, in run
    self._target(*self._args, **self._kwargs)
  File "D:\conda\envs\env_name\lib\site-packages\torch\utils\_contextlib.py", line 116, in decorate_context
    return func(*args, **kwargs)
  File "D:\conda\envs\env_name\lib\site-packages\transformers\generation\utils.py", line 1622, in generate
    result = self._sample(
  File "D:\conda\envs\env_name\lib\site-packages\transformers\generation\utils.py", line 2841, in _sample
    model_kwargs = self._update_model_kwargs_for_generation(
  File "C:\Users\Administrator\.cache\huggingface\modules\transformers_modules\glm-4-9b-chat\modeling_chatglm.py", line 929, in _update_model_kwargs_for_generation
    cache_name, cache = self._extract_past_from_model_output(outputs)
ValueError: too many values to unpack (expected 2)
Traceback (most recent call last):
  File "E:\ChatGLM3\basic_demo\trans_cli_demo.py", line 112, in <module>
    for new_token in streamer:
  File "D:\conda\envs\env_name\lib\site-packages\transformers\generation\streamers.py", line 223, in __next__
    value = self.text_queue.get(timeout=self.timeout)
  File "D:\conda\envs\env_name\lib\queue.py", line 179, in get
    raise Empty
_queue.Empty
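The `_queue.Empty` at the end looks like a secondary symptom rather than a separate bug: trans_cli_demo.py uses the usual TextIteratorStreamer pattern, where model.generate runs in a background thread and the main loop pulls decoded text off a queue. Once the generate thread dies with the ValueError above, nothing is ever put on the queue, so the consumer just times out. A minimal sketch of that pattern for context (a rough reconstruction, not the demo's exact code; the model path, timeout, and generation settings here are assumptions):

from threading import Thread

from transformers import AutoModelForCausalLM, AutoTokenizer, TextIteratorStreamer

# Assumed model path; trust_remote_code pulls in the cached
# modeling_chatglm.py that appears in the first traceback.
MODEL_PATH = "THUDM/glm-4-9b-chat"

tokenizer = AutoTokenizer.from_pretrained(MODEL_PATH, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_PATH, trust_remote_code=True, device_map="auto"
)

inputs = tokenizer.apply_chat_template(
    [{"role": "user", "content": "hello"}],
    add_generation_prompt=True, tokenize=True,
    return_tensors="pt", return_dict=True,
).to(model.device)

# Decoded text travels from the generate thread to this one through a queue;
# if generate() raises, the queue stays empty and iteration dies with
# queue.Empty after the timeout -- the second traceback above.
streamer = TextIteratorStreamer(
    tokenizer, timeout=60, skip_prompt=True, skip_special_tokens=True
)
Thread(
    target=model.generate,
    kwargs=dict(**inputs, streamer=streamer, max_new_tokens=256),
).start()
for new_token in streamer:
    print(new_token, end="", flush=True)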
I switched the transformers package to 4.40.0 (the officially required version), and also tried 4.40.2 and 4.39.3; none of them helped, and I don't know what to do now.
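As far as I can tell, the ValueError comes from a mismatch between the cached modeling_chatglm.py and the installed transformers: the model code does `cache_name, cache = self._extract_past_from_model_output(outputs)`, i.e. it expects this private helper to return a two-element tuple, but in 4.40.x and earlier the helper returns past_key_values itself (a tuple with one entry per layer), which matches "too many values to unpack (expected 2)". That is only my hypothesis, not a confirmed fix. A quick way to check which behavior a given install has (it pokes at a private method, so treat the result as a hint, not a guarantee):

import inspect

import transformers
from transformers.generation.utils import GenerationMixin

print("transformers", transformers.__version__)

# Crude heuristic: newer releases build and return (cache_name, cache),
# older ones return past_key_values directly. Private API, so this check
# can itself break on future versions.
fn = getattr(GenerationMixin, "_extract_past_from_model_output", None)
if fn is None:
    print("helper not present in this version")
else:
    print("two-value return:", "cache_name" in inspect.getsource(fn))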