Add support for ministral3 and mistral3 model types #860

sealad886 wants to merge 8 commits into ml-explore:main

Conversation
Co-authored-by: sealad886 <155285242+sealad886@users.noreply.github.com>
…prove tests, fix GGUF export
Co-authored-by: sealad886 <155285242+sealad886@users.noreply.github.com>
…tization and GGUF export
Pull request overview
This PR adds recognition and handling for the ministral3 and mistral3 model types across quantization, GGUF export gating, and unit tests so these model variants can be instantiated and processed consistently within mlx_lm.
Changes:
- Added ministral3/mistral3 entries to the AWQ model configuration mapping.
- Updated GGUF export gating to apply MODEL_REMAPPING and allow ministral3.
- Added unit tests for ministral3 and mistral3 model construction and prompt-cache creation.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| tests/test_models.py | Adds coverage for ministral3 and mistral3 instantiation plus prompt-cache construction. |
| mlx_lm/quant/awq.py | Extends the AWQ configuration mapping to support the new model types (including language_model nesting for mistral3). |
| mlx_lm/fuse.py | Uses MODEL_REMAPPING when checking GGUF export support and expands the allowlist. |
Comments suppressed due to low confidence (1)
mlx_lm/fuse.py:103

convert_to_gguf() derives RoPE metadata from the top-level config["rope_theta"] / config["rope_scaling"], but ministral3 uses rope_parameters (see mlx_lm/models/ministral3.py) with rope_theta nested under that dict. As-is, GGUF export for ministral3 will ignore a non-default rope_parameters["rope_theta"] and any associated scaling params, producing incorrect metadata for some checkpoints. Consider normalizing the config before calling convert_to_gguf() (e.g., copy rope_parameters["rope_theta"] into rope_theta, and map any scaling fields as needed).
```python
# Apply MODEL_REMAPPING to match load() behavior
remapped_type = MODEL_REMAPPING.get(model_type, model_type)
if remapped_type not in ["llama", "mixtral", "mistral", "ministral3"]:
    raise ValueError(
        f"Model type {model_type} not supported for GGUF conversion."
    )
weights = dict(tree_flatten(model.parameters()))
convert_to_gguf(save_path, weights, config, str(save_path / args.gguf_path))
```
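The normalization suggested in the comment above could be sketched as follows. This is a hypothetical helper, not code from this repo; the name `normalize_rope_config` and the assumption that every non-theta key under `rope_parameters` is a scaling field are mine:

```python
def normalize_rope_config(config: dict) -> dict:
    """Hoist nested rope_parameters up to the top-level keys that
    convert_to_gguf() reads (hypothetical helper, not part of mlx_lm)."""
    config = dict(config)  # shallow copy; don't mutate the caller's dict
    rope = config.get("rope_parameters") or {}
    if "rope_theta" in rope:
        config["rope_theta"] = rope["rope_theta"]
    # Treat the remaining keys as scaling parameters.
    scaling = {k: v for k, v in rope.items() if k != "rope_theta"}
    if scaling:
        config["rope_scaling"] = scaling
    return config
```

Calling this on the config just before `convert_to_gguf()` would keep the export path unchanged for models that already store `rope_theta` at the top level.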
```diff
 model_type = config["model_type"]
-if model_type not in ["llama", "mixtral", "mistral"]:
+# Apply MODEL_REMAPPING to match load() behavior
+remapped_type = MODEL_REMAPPING.get(model_type, model_type)
+if remapped_type not in ["llama", "mixtral", "mistral", "ministral3"]:
     raise ValueError(
         f"Model type {model_type} not supported for GGUF conversion."
```
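The `dict.get` fallback in this hunk means unknown model types pass through unchanged, so only explicitly remapped types are rewritten before the allowlist check. A minimal illustration (the mapping contents here are illustrative, not the actual `MODEL_REMAPPING` from `mlx_lm`):

```python
# Illustrative remapping table; the real MODEL_REMAPPING lives in mlx_lm.
MODEL_REMAPPING = {"mistral": "llama"}

def remap(model_type: str) -> str:
    # Unknown types fall through unchanged, matching load() behavior.
    return MODEL_REMAPPING.get(model_type, model_type)
```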
The PR description says GGUF export supports both ministral3 and mistral3, but this allowlist only adds ministral3. If mistral3 is intended to be supported, GGUF conversion likely also needs special handling because mistral3 configs are nested under text_config and parameter names are prefixed with language_model., which convert_to_gguf()/translate_weight_names() won’t match. Either add the missing mistral3 support end-to-end (flatten config + strip/translate weight prefixes) or update the PR description to reflect that only ministral3 is supported for GGUF export.
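If mistral3 support is added end-to-end, the flattening this comment describes could be sketched like so. The helper name `flatten_mistral3` is hypothetical, and this assumes the nesting is exactly a `text_config` dict plus a `language_model.` weight prefix:

```python
def flatten_mistral3(config: dict, weights: dict) -> tuple[dict, dict]:
    """Hypothetical pre-processing for mistral3 GGUF export: hoist
    text_config to the top level and drop the language_model. prefix."""
    flat_config = {**config, **config.get("text_config", {})}
    flat_config.pop("text_config", None)
    prefix = "language_model."
    flat_weights = {
        (k[len(prefix):] if k.startswith(prefix) else k): v
        for k, v in weights.items()
    }
    return flat_config, flat_weights
```

With something like this applied first, the existing `convert_to_gguf()`/`translate_weight_names()` paths would see the same shapes they see for plain mistral checkpoints.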
```python
self.model_test_runner(
    model, args.model_type, text_config["vocab_size"], text_config["num_hidden_layers"]
)
```
This model_test_runner(...) call exceeds Black’s default line length and will be reformatted by the pre-commit hook (which runs in CI). Please run pre-commit/Black so the arguments are wrapped consistently with the rest of the file.
This pull request adds support for the new ministral3 and mistral3 model types across the codebase, ensuring they are properly recognized, handled in quantization, and thoroughly tested. The changes also improve consistency in model type remapping and GGUF export logic.

Model support and integration:

- Added the ministral3 and mistral3 model types to the AWQ quantization configuration, ensuring these models are supported during quantization and mapped to the correct configuration (mlx_lm/quant/awq.py).
- Applied MODEL_REMAPPING for consistent model type handling and included ministral3 and mistral3 as supported types (mlx_lm/fuse.py). [1] [2]

Testing:

- Added unit tests for the ministral3 and mistral3 models, covering model instantiation and prompt cache construction, to ensure correct behavior and compatibility (tests/test_models.py).