Description

Running the DeepSeek-VL2 inference demo (deploy/deepseek_vl2/shell/run.sh) fails while building the fast inference model: paddlenlp.experimental.transformers.proposers imports draft_model_postprocess from paddlenlp_ops, which the installed paddlenlp_ops package does not provide (the log also warns "No paddlenlp_ops.sm86"). Importing the symbol directly from Python fails with the same error. Full console log:
(pd) jyt@dsvl2:~/PaddleMIX$ sh deploy/deepseek_vl2/shell/run.sh
/home/jyt/anaconda3/envs/pd/lib/python3.10/site-packages/_distutils_hack/__init__.py:30: UserWarning: Setuptools is replacing distutils. Support for replacing an already imported distutils is deprecated. In the future, this condition will fail. Register concerns at https://github.com/pypa/setuptools/issues/new?template=distutils-deprecation.yml
warnings.warn(
W0507 13:37:27.378628 127465 gpu_resources.cc:119] Please NOTE: device: 0, GPU Compute Capability: 8.6, Driver API Version: 11.8, Runtime API Version: 11.8
W0507 13:37:27.379776 127465 gpu_resources.cc:164] device: 0, cuDNN Version: 8.9.
import module error: fast_ln
import module error: fused_ln
Warning, FusedLn module is not available, use LayerNorm instead.
Warning, FusedLn module is not available, use LayerNorm instead.
[2025-05-07 13:37:30,940] [ WARNING] moe_lm.py:454 - grouped_gemm is not installed, using sequential GEMM, which is slower.
modeling_internlm2 has_flash_attn is True.
modeling_intern_vit has_flash_attn is True.
[2025-05-07 13:37:30,959] [ WARNING] - No paddlenlp_ops.sm86
(…)eepseek-ai/deepseek-vl2-tiny/config.json: 100%|███████████████████████████████████████████████████████████████████████████| 2.28k/2.28k [00:00<00:00, 18.4MB/s]
[2025-05-07 13:37:31,096] [ INFO] - Loading configuration file /home/jyt/.paddlenlp/models/deepseek-ai/deepseek-vl2-tiny/config.json
(…)ek-vl2-tiny/model.safetensors.index.json: 100%|█████████████████████████████████████████████████████████████████████████████| 247k/247k [00:00<00:00, 13.1MB/s]
[2025-05-07 13:37:31,242] [ INFO] - Loading weights file from cache at /home/jyt/.paddlenlp/models/deepseek-ai/deepseek-vl2-tiny/model.safetensors.index.json
Downloading shards: 0%| | 0/1 [00:00<?, ?it/s]
(…)2-tiny/model-00001-of-000001.safetensors: 100%|█████████████████████████████████████████████████████████████████████████| 6.74G/6.74G [1:32:37<00:00, 1.21MB/s]
Downloading shards: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [1:32:38<00:00, 5558.01s/it]
[2025-05-07 15:10:22,508] [ INFO] - All model checkpoint weights were used when initializing DeepseekVLV2ForCausalLM.
[2025-05-07 15:10:22,508] [ INFO] - All the weights of DeepseekVLV2ForCausalLM were initialized from the model checkpoint at deepseek-ai/deepseek-vl2-tiny.
If your task is similar to the task the model of the checkpoint was trained on, you can already use DeepseekVLV2ForCausalLM for predictions without further training.
[2025-05-07 15:10:22,562] [ INFO] - Generation config file not found, using a generation config created from the model config.
(…)seek-ai/deepseek-vl2-tiny/tokenizer.json: 100%|███████████████████████████████████████████████████████████████████████████| 6.27M/6.27M [00:04<00:00, 1.50MB/s]
(…)eepseek-vl2-tiny/special_tokens_map.json: 100%|███████████████████████████████████████████████████████████████████████████████| 801/801 [00:00<00:00, 7.21MB/s]
(…)/deepseek-vl2-tiny/tokenizer_config.json: 100%|█████████████████████████████████████████████████████████████████████████████| 165k/165k [00:00<00:00, 2.98MB/s]
[2025-05-07 15:10:27,833] [ INFO] - Loading configuration file /home/jyt/.paddlenlp/models/deepseek-ai/deepseek-vl2-tiny/config.json
Add pad token = ['<|▁pad▁|>'] to the tokenizer
<|▁pad▁|>:2
Add image token = ['<image>'] to the tokenizer
<image>:128815
[2025-05-07 15:10:27,983] [ INFO] - Assigning ['<|ref|>', '<|/ref|>', '<|det|>', '<|/det|>', '<|grounding|>'] to the additional_special_tokens key of the tokenizer
Add grounding-related tokens = ['<|ref|>', '<|/ref|>', '<|det|>', '<|/det|>', '<|grounding|>'] to the tokenizer with input_ids <|ref|>:128816 <|/ref|>:128817 <|det|>:128818 <|/det|>:128819 <|grounding|>:128820
[2025-05-07 15:10:27,985] [ INFO] - Assigning ['<|User|>', '<|Assistant|>'] to the additional_special_tokens key of the tokenizer
Add chat tokens = ['<|User|>', '<|Assistant|>'] to the tokenizer with input_ids
<|User|>:128821
<|Assistant|>:128822
/home/jyt/PaddleMIX/PaddleNLP/paddlenlp/generation/configuration_utils.py:250: UserWarning: using greedy search strategy. However, temperature is set to 0.1 -- this flag is only used in sample-based generation modes. You should set decode_strategy="greedy_search" or unset temperature. This was detected when initializing the generation config instance, which means the corresponding file may hold incorrect parameterization and should be fixed.
warnings.warn(
/home/jyt/PaddleMIX/PaddleNLP/paddlenlp/generation/configuration_utils.py:255: UserWarning: using greedy search strategy. However, top_p is set to 0.001 -- this flag is only used in sample-based generation modes. You should set decode_strategy="greedy_search" or unset top_p. This was detected when initializing the generation config instance, which means the corresponding file may hold incorrect parameterization and should be fixed.
warnings.warn(
Traceback (most recent call last):
File "/home/jyt/PaddleMIX/deploy/deepseek_vl2/deepseek_vl2_infer.py", line 244, in
fast_llm_model = AutoInferenceModelForCausalLM.from_pretrained(
File "/home/jyt/PaddleMIX/PaddleNLP/paddlenlp/transformers/auto/modeling.py", line 854, in from_pretrained
import_class = importlib.import_module(f"paddlenlp.experimental.transformers.{config.model_type}.modeling")
File "/home/jyt/anaconda3/envs/pd/lib/python3.10/importlib/init.py", line 126, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
File "", line 1050, in _gcd_import
File "", line 1027, in _find_and_load
File "", line 992, in _find_and_load_unlocked
File "", line 241, in _call_with_frames_removed
File "", line 1050, in _gcd_import
File "", line 1027, in _find_and_load
File "", line 992, in _find_and_load_unlocked
File "", line 241, in _call_with_frames_removed
File "", line 1050, in _gcd_import
File "", line 1027, in _find_and_load
File "", line 1006, in _find_and_load_unlocked
File "", line 688, in _load_unlocked
File "", line 883, in exec_module
File "", line 241, in _call_with_frames_removed
File "/home/jyt/PaddleMIX/PaddleNLP/paddlenlp/experimental/transformers/init.py", line 23, in
from .proposers import *
File "/home/jyt/PaddleMIX/PaddleNLP/paddlenlp/experimental/transformers/proposers.py", line 24, in
from paddlenlp_ops import (
ImportError: cannot import name 'draft_model_postprocess' from 'paddlenlp_ops' (/home/jyt/anaconda3/envs/pd/lib/python3.10/site-packages/paddlenlp_ops/__init__.py)
(pd) jyt@dsvl2:~/PaddleMIX$ python -c "from paddlenlp_ops import draft_model_postprocess; print('Success')"
/home/jyt/anaconda3/envs/pd/lib/python3.10/site-packages/_distutils_hack/__init__.py:30: UserWarning: Setuptools is replacing distutils. Support for replacing an already imported distutils is deprecated. In the future, this condition will fail. Register concerns at https://github.com/pypa/setuptools/issues/new?template=distutils-deprecation.yml
warnings.warn(
[2025-05-07 15:16:56,104] [ WARNING] - No paddlenlp_ops.sm86
Traceback (most recent call last):
File "", line 1, in
ImportError: cannot import name 'draft_model_postprocess' from 'paddlenlp_ops' (/home/jyt/anaconda3/envs/pd/lib/python3.10/site-packages/paddlenlp_ops/__init__.py)
(pd) jyt@dsvl2:~/PaddleMIX$
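For triage, here is a small diagnostic sketch of my own (not part of PaddleMIX or run.sh; it assumes paddle and paddlenlp_ops import successfully in this environment). It prints the GPU compute capability behind the "No paddlenlp_ops.sm86" warning and checks whether the installed paddlenlp_ops build exports draft_model_postprocess, the symbol that proposers.py fails to import:

```python
# Diagnostic sketch (assumption: paddle and paddlenlp_ops are installed in this env).
import paddle
import paddlenlp_ops

# Compute capability of GPU 0; (8, 6) corresponds to the "sm86" in the warning above.
major, minor = paddle.device.cuda.get_device_capability(0)
print(f"GPU 0 compute capability: sm{major}{minor}")

# Where paddlenlp_ops was loaded from, and whether it exports the symbol that
# paddlenlp/experimental/transformers/proposers.py tries to import.
print("paddlenlp_ops location:", paddlenlp_ops.__file__)
print("exports draft_model_postprocess:",
      hasattr(paddlenlp_ops, "draft_model_postprocess"))
```

Given the ImportError above, the last line should print False here, which suggests the installed paddlenlp_ops wheel simply lacks these ops rather than anything being wrong in the DeepSeek-VL2 deploy code itself.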