[Bug]: GLM-4.1V-Thinking ValueError #21811

@XiaYijun124

Your current environment

vllm==0.9.2. The model weights are from https://huggingface.co/zai-org/GLM-4.1V-9B-Thinking/tree/main

🐛 Describe the bug

EngineCore hits a fatal ValueError while merging multimodal embeddings for GLM-4.1V-Thinking:
ERROR 07-29 06:54:54 [core.py:588] EngineCore encountered a fatal error.
ERROR 07-29 06:54:54 [core.py:588] Traceback (most recent call last):
ERROR 07-29 06:54:54 [core.py:588] File "/usr/local/lib/python3.10/dist-packages/vllm/v1/engine/core.py", line 579, in run_engine_core
ERROR 07-29 06:54:54 [core.py:588] engine_core.run_busy_loop()
ERROR 07-29 06:54:54 [core.py:588] File "/usr/local/lib/python3.10/dist-packages/vllm/v1/engine/core.py", line 606, in run_busy_loop
ERROR 07-29 06:54:54 [core.py:588] self._process_engine_step()
ERROR 07-29 06:54:54 [core.py:588] File "/usr/local/lib/python3.10/dist-packages/vllm/v1/engine/core.py", line 631, in _process_engine_step
ERROR 07-29 06:54:54 [core.py:588] outputs, model_executed = self.step_fn()
ERROR 07-29 06:54:54 [core.py:588] File "/usr/local/lib/python3.10/dist-packages/vllm/v1/engine/core.py", line 235, in step
ERROR 07-29 06:54:54 [core.py:588] model_output = self.execute_model(scheduler_output)
ERROR 07-29 06:54:54 [core.py:588] File "/usr/local/lib/python3.10/dist-packages/vllm/v1/engine/core.py", line 221, in execute_model
ERROR 07-29 06:54:54 [core.py:588] raise err
ERROR 07-29 06:54:54 [core.py:588] File "/usr/local/lib/python3.10/dist-packages/vllm/v1/engine/core.py", line 212, in execute_model
ERROR 07-29 06:54:54 [core.py:588] return self.model_executor.execute_model(scheduler_output)
ERROR 07-29 06:54:54 [core.py:588] File "/usr/local/lib/python3.10/dist-packages/vllm/v1/executor/abstract.py", line 87, in execute_model
ERROR 07-29 06:54:54 [core.py:588] output = self.collective_rpc("execute_model",
ERROR 07-29 06:54:54 [core.py:588] File "/usr/local/lib/python3.10/dist-packages/vllm/executor/uniproc_executor.py", line 57, in collective_rpc
ERROR 07-29 06:54:54 [core.py:588] answer = run_method(self.driver_worker, method, args, kwargs)
ERROR 07-29 06:54:54 [core.py:588] File "/usr/local/lib/python3.10/dist-packages/vllm/utils/__init__.py", line 2736, in run_method
ERROR 07-29 06:54:54 [core.py:588] return func(*args, **kwargs)
ERROR 07-29 06:54:54 [core.py:588] File "/usr/local/lib/python3.10/dist-packages/torch/utils/_contextlib.py", line 116, in decorate_context
ERROR 07-29 06:54:54 [core.py:588] return func(*args, **kwargs)
ERROR 07-29 06:54:54 [core.py:588] File "/usr/local/lib/python3.10/dist-packages/vllm/v1/worker/gpu_worker.py", line 308, in execute_model
ERROR 07-29 06:54:54 [core.py:588] output = self.model_runner.execute_model(scheduler_output,
ERROR 07-29 06:54:54 [core.py:588] File "/usr/local/lib/python3.10/dist-packages/torch/utils/_contextlib.py", line 116, in decorate_context
ERROR 07-29 06:54:54 [core.py:588] return func(*args, **kwargs)
ERROR 07-29 06:54:54 [core.py:588] File "/usr/local/lib/python3.10/dist-packages/vllm/v1/worker/gpu_model_runner.py", line 1332, in execute_model
ERROR 07-29 06:54:54 [core.py:588] inputs_embeds = self.model.get_input_embeddings(
ERROR 07-29 06:54:54 [core.py:588] File "/usr/local/lib/python3.10/dist-packages/vllm/model_executor/models/glm4_1v.py", line 1474, in get_input_embeddings
ERROR 07-29 06:54:54 [core.py:588] inputs_embeds = merge_multimodal_embeddings(
ERROR 07-29 06:54:54 [core.py:588] File "/usr/local/lib/python3.10/dist-packages/vllm/model_executor/models/utils.py", line 505, in merge_multimodal_embeddings
ERROR 07-29 06:54:54 [core.py:588] return _merge_multimodal_embeddings(
ERROR 07-29 06:54:54 [core.py:588] File "/usr/local/lib/python3.10/dist-packages/vllm/model_executor/models/utils.py", line 427, in _merge_multimodal_embeddings
ERROR 07-29 06:54:54 [core.py:588] raise ValueError(
ERROR 07-29 06:54:54 [core.py:588] ValueError: Attempted to assign 646 = 646 multimodal tokens to 1292 placeholders
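
To make the final error easier to read, here is a minimal sketch of the kind of consistency check that raises it. It is not vLLM's actual `_merge_multimodal_embeddings` code. The function `merge_embeddings`, the constant `PLACEHOLDER_ID`, and the list-based embeddings are all hypothetical stand-ins. The sketch only shows that the merge fails when the number of multimodal embeddings differs from the number of placeholder positions in the prompt. In this report the placeholder count (1292) is exactly twice the embedding count (646).

```python
PLACEHOLDER_ID = -1  # hypothetical id marking a multimodal placeholder token


def merge_embeddings(token_ids, text_embeds, mm_embeds,
                     placeholder_id=PLACEHOLDER_ID):
    """Put one multimodal embedding into each placeholder position."""
    positions = [i for i, tok in enumerate(token_ids) if tok == placeholder_id]
    # Every placeholder needs exactly one embedding, otherwise refuse to merge.
    if len(positions) != len(mm_embeds):
        raise ValueError(
            f"Attempted to assign {len(mm_embeds)} multimodal tokens "
            f"to {len(positions)} placeholders")
    merged = list(text_embeds)
    for pos, emb in zip(positions, mm_embeds):
        merged[pos] = emb
    return merged


# Same counts as in the log: 646 embeddings, 1292 placeholder positions.
token_ids = [PLACEHOLDER_ID] * 1292
text_embeds = [[0.0]] * 1292
mm_embeds = [[1.0]] * 646

try:
    merge_embeddings(token_ids, text_embeds, mm_embeds)
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(error_message)
```

When the two counts agree, the same function returns the merged sequence, so the failure points to a disagreement between the prompt's placeholder count and the encoder's output size, not to the merge step itself.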

Labels: bug