LLaVA-onevision-qwen2-7b model-parallel multi-GPU inference error #1084

@GraygoodsEiko

Description

System: Ubuntu 20.04.6 LTS
GPUs: 4× NVIDIA GeForce RTX 3090, CUDA 12.4
Paddle environment: paddlemix 0.1.0; paddlenlp 3.0.0b3; paddlepaddle-gpu 3.0.0rc

Reproduction script:

# test.py
import paddle
from paddle.distributed import fleet
from paddlenlp.transformers import Qwen2Tokenizer

# NOTE: the exact import paths for the PaddleMIX classes below may differ
# across paddlemix versions; adjust to your installation.
from paddlemix.models.llava.language_model.llava_qwen import LlavaQwenForCausalLM
from paddlemix.processors import SigLipImageProcessor


def main():
    # dp * mp * pp * sharding = 1 * 4 * 1 * 1 = 4 GPUs
    strategy = fleet.DistributedStrategy()
    strategy.hybrid_configs = {
        "dp_degree": 1,
        "mp_degree": 4,
        "pp_degree": 1,
        "sharding_degree": 1,
    }
    fleet.init(is_collective=True, strategy=strategy)
    hcg = fleet.get_hybrid_communicate_group()
    tensor_parallel_rank = hcg.get_model_parallel_rank()
    paddle.seed(seed=0)

    model_name = "lmms-lab/llava-onevision-qwen2-7b-si"
    compute_dtype = "float16"

    model = LlavaQwenForCausalLM.from_pretrained(
        model_name,
        tensor_parallel_degree=4,
        tensor_parallel_rank=tensor_parallel_rank,
        dtype=compute_dtype,
    ).eval()  # <-- raises the error below
    tokenizer = Qwen2Tokenizer.from_pretrained(model_name)
    image_processor = SigLipImageProcessor()


if __name__ == "__main__":
    main()

Launch command: python -m paddle.distributed.launch --gpus="0,1,2,3" test.py

Loading the checkpoint fails with the following error, reporting parameter shape mismatches:

Loading checkpoint shards: 100%|██████████| 4/4 [01:56<00:00, 29.08s/it]
Traceback (most recent call last):
  File "/home/mengxy/Chat/PaddleMIX/test.py", line 75, in <module>
    main()
  File "/home/mengxy/Chat/PaddleMIX/test.py", line 40, in main
    model = LlavaQwenForCausalLM.from_pretrained(model_name, tensor_parallel_degree=4, tensor_parallel_rank=tensor_parallel_rank, dtype=compute_dtype).eval()
  File "/home/mengxy/anaconda3/envs/paddlemix/lib/python3.10/site-packages/paddlenlp/transformers/model_utils.py", line 2529, in from_pretrained
    model, missing_keys, unexpected_keys, mismatched_keys = cls._load_pretrained_model(
  File "/home/mengxy/anaconda3/envs/paddlemix/lib/python3.10/site-packages/paddlenlp/transformers/model_utils.py", line 2216, in _load_pretrained_model
    raise RuntimeError(f"Error(s) in loading state_dict for {model.__class__.__name__}:\n\t{error_msg}")
RuntimeError: Error(s) in loading state_dict for LlavaQwenForCausalLM:
        Skip loading for qwen2.embed_tokens.weight. qwen2.embed_tokens.weight receives a shape [152064, 3584], but the expected shape is [38016, 3584].
        Skip loading for qwen2.layers.0.self_attn.q_proj.weight. qwen2.layers.0.self_attn.q_proj.weight receives a shape [3584, 3584], but the expected shape is [3584, 896].
        Skip loading ......
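
For context (my reading of the log, not part of the error output): the "expected" shapes are exactly the checkpoint shapes divided by mp_degree = 4, i.e. each rank only allocates its 1/4 shard of every tensor-parallel weight. A quick sanity check using the two shapes reported above:

# expected shard shape = full checkpoint shape / mp_degree (= 4)
assert 152064 // 4 == 38016  # qwen2.embed_tokens.weight: vocab dim is split
assert 3584 // 4 == 896      # qwen2.layers.0.self_attn.q_proj.weight: output dim is split

So each rank builds 1/4-sized parameters, while the loader apparently hands it the full, unsplit checkpoint tensors.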

I'd appreciate guidance on how to configure the code correctly. Thanks!
