Description
Is your feature request related to a problem? Please describe.
Hello, I am trying to convert a very large model, Qwen3-Coder-480B. The conversion uses only the CPU, which is desired since the model cannot fit on a single GPU. The issue is that the GPU nodes are capped at ~1.5 TB of RAM. This is enough to load the model and run the conversion, but the job then runs out of RAM while writing the converted checkpoint.
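For a rough sense of scale, here is a back-of-the-envelope estimate. It assumes the weights are held in bf16 (2 bytes per parameter) and that the write step briefly keeps a second full copy in memory; both are assumptions for illustration, not something I have confirmed in the converter. The 480e9 figure is just the nominal count from the model name.

```python
def weights_size_tb(num_params: float, bytes_per_param: int = 2) -> float:
    """Approximate size of the raw weights in decimal terabytes."""
    return num_params * bytes_per_param / 1e12


NUM_PARAMS = 480e9   # nominal parameter count from the model name (assumption)
RAM_CAP_TB = 1.5     # approximate RAM available on the GPU nodes

one_copy = weights_size_tb(NUM_PARAMS)   # weights loaded once
two_copies = 2 * one_copy                # hypothetical second copy while writing

print(f"one copy:   {one_copy:.2f} TB (fits under cap: {one_copy < RAM_CAP_TB})")
print(f"two copies: {two_copies:.2f} TB (fits under cap: {two_copies < RAM_CAP_TB})")
```

Under those assumptions a single copy fits within the cap but two copies do not, which would be consistent with the load succeeding and the write failing.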
Describe the solution you'd like
There are nodes with large RAM (4 TB+), but without GPUs. Running the conversion on them fails because the Megatron initialization requires a GPU. Is there a way for the conversion to support CPU-only systems?
Describe alternatives you've considered
Tried megatron-bridge, nemo llm import, and ms-swift conversion tool. All seem to have the same GPU requirement.