System: Ubuntu 20.04.6 LTS
GPU: 4 × NVIDIA GeForce RTX 3090, CUDA 12.4
Paddle environment: paddlemix 0.1.0; paddlenlp 3.0.0b3; paddlepaddle-gpu 3.0.0rc
Script:
```python
# test.py
import paddle
from paddle.distributed import fleet
from paddlenlp.transformers import Qwen2Tokenizer

# Note: the two imports below follow the PaddleMIX LLaVA-OneVision examples;
# the exact module paths may differ across PaddleMIX versions.
from paddlemix.models.llava import LlavaQwenForCausalLM
from paddlemix.processors import SigLipImageProcessor


def main():
    strategy = fleet.DistributedStrategy()
    strategy.hybrid_configs = {
        "dp_degree": 1,
        "mp_degree": 4,
        "pp_degree": 1,
        "sharding_degree": 1,
    }
    fleet.init(is_collective=True, strategy=strategy)
    hcg = fleet.get_hybrid_communicate_group()
    tensor_parallel_rank = hcg.get_model_parallel_rank()

    paddle.seed(seed=0)
    model_name = "lmms-lab/llava-onevision-qwen2-7b-si"
    compute_dtype = "float16"
    model = LlavaQwenForCausalLM.from_pretrained(
        model_name, tensor_parallel_degree=4,
        tensor_parallel_rank=tensor_parallel_rank, dtype=compute_dtype,
    ).eval()  # fails here, see the traceback below
    tokenizer = Qwen2Tokenizer.from_pretrained(model_name)
    image_processor = SigLipImageProcessor()


if __name__ == "__main__":
    main()
```

Launch command:

```
python -m paddle.distributed.launch --gpus="0,1,2,3" test.py
```
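As a quick sanity check (my addition, not part of the original report), the hybrid communicate group can be inspected right after `fleet.init` to confirm that the 4-way model-parallel group was actually created; both accessors below are standard `HybridCommunicateGroup` methods in Paddle:

```python
# Minimal sketch: confirm the model-parallel group matches mp_degree=4.
hcg = fleet.get_hybrid_communicate_group()
print("mp rank:", hcg.get_model_parallel_rank(),
      "| mp world size:", hcg.get_model_parallel_world_size())
# With --gpus="0,1,2,3", each process should report a world size of 4.
```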
Loading fails with the following error, which reports a weight shape mismatch:
```
Loading checkpoint shards: 100%|██████████| 4/4 [01:56<00:00, 29.08s/it]
Traceback (most recent call last):
  File "/home/mengxy/Chat/PaddleMIX/test.py", line 75, in <module>
    main()
  File "/home/mengxy/Chat/PaddleMIX/test.py", line 40, in main
    model = LlavaQwenForCausalLM.from_pretrained(model_name, tensor_parallel_degree=4, tensor_parallel_rank=tensor_parallel_rank, dtype=compute_dtype).eval()
  File "/home/mengxy/anaconda3/envs/paddlemix/lib/python3.10/site-packages/paddlenlp/transformers/model_utils.py", line 2529, in from_pretrained
    model, missing_keys, unexpected_keys, mismatched_keys = cls._load_pretrained_model(
  File "/home/mengxy/anaconda3/envs/paddlemix/lib/python3.10/site-packages/paddlenlp/transformers/model_utils.py", line 2216, in _load_pretrained_model
    raise RuntimeError(f"Error(s) in loading state_dict for {model.__class__.__name__}:\n\t{error_msg}")
RuntimeError: Error(s) in loading state_dict for LlavaQwenForCausalLM:
    Skip loading for qwen2.embed_tokens.weight. qwen2.embed_tokens.weight receives a shape [152064, 3584], but the expected shape is [38016, 3584].
    Skip loading for qwen2.layers.0.self_attn.q_proj.weight. qwen2.layers.0.self_attn.q_proj.weight receives a shape [3584, 3584], but the expected shape is [3584, 896].
    Skip loading ......
```
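Reading the numbers (my own note, not part of the log): each "expected" shape is exactly the checkpoint shape divided by the model-parallel degree along the split axis, so the model appears to allocate tensor-parallel slices while the checkpoint shards still hold the full, unsplit weights:

```python
# Hypothetical check of the reported shapes against mp_degree = 4.
mp_degree = 4
assert 152064 // mp_degree == 38016  # embed_tokens.weight, split on the vocab axis
assert 3584 // mp_degree == 896      # q_proj.weight, split on the output axis
```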
I'd like to know how the code should be set up to load the model correctly. Thanks!