magnusfroste/mlx-ml

MLX-ML: Using config.yaml with MLX-LM

This guide provides example terminal commands for using the config.yaml file with MLX-LM, based on the official MLX-LM LoRA documentation.

Installation: Setting up MLX-LM in a Virtual Environment

It is recommended to run MLX-LM inside a Python virtual environment. To install the mlx-lm package in a .venv:

python3 -m venv .venv
source .venv/bin/activate
pip install mlx-lm
  • If you want the latest development version, you can install directly from GitHub:
    pip install git+https://github.com/ml-explore/mlx-lm.git

Dataset Preparation

To prepare your dataset for LoRA fine-tuning with MLX-LM:

  1. Choose a format: Supported formats include completions, text, and chat. See examples below.
  2. Create your data files:
    • For most use cases, create train.jsonl and valid.jsonl (and optionally test.jsonl).
    • Each line should be a valid JSON object in the chosen format.
  3. Place files in the data/ directory:
    • Example: data/train.jsonl, data/valid.jsonl
  4. Reference the data path in your config.yaml:
    • Use the relative path to your data directory or dataset name under the data: key.
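The four steps above can be scripted. Below is a minimal sketch (standard-library Python only; the example rows are made up for illustration) that writes a small completions-format dataset into data/train.jsonl and data/valid.jsonl:

```python
import json
import os

# Hypothetical examples in the "completions" format.
examples = [
    {"prompt": "What is the capital of France?", "completion": "Paris."},
    {"prompt": "What is 2 + 2?", "completion": "4."},
    {"prompt": "Name a primary color.", "completion": "Red."},
    {"prompt": "What does LoRA stand for?", "completion": "Low-Rank Adaptation."},
]

def write_jsonl(path, rows):
    # One JSON object per line, with no line breaks inside an example.
    with open(path, "w") as f:
        for row in rows:
            f.write(json.dumps(row) + "\n")

os.makedirs("data", exist_ok=True)
split = int(len(examples) * 0.75)  # simple 75/25 train/valid split
write_jsonl("data/train.jsonl", examples[:split])
write_jsonl("data/valid.jsonl", examples[split:])
```

With data/ referenced under the data: key in config.yaml, MLX-LM will pick these files up by name.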

Example: Completion Format

Each line in train.jsonl:

{"prompt": "What is the capital of France?", "completion": "Paris."}

Example: Text Format

Each line in train.jsonl:

{"text": "This is an example for the model."}
  • Ensure each example is on a single line (no line breaks within an example).
  • Extra keys in each JSON object will be ignored.
  • For more advanced formats (e.g., chat, tools), see the official documentation.
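Because a single malformed line can break training, it can help to sanity-check your files first. The sketch below is not part of MLX-LM; it is a standalone helper that verifies each line of a .jsonl file parses as a JSON object containing the expected keys:

```python
import json

def validate_jsonl(path, required_keys):
    """Return a list of error strings; empty means the file is well formed."""
    errors = []
    with open(path) as f:
        for lineno, line in enumerate(f, start=1):
            if not line.strip():
                continue  # tolerate trailing blank lines
            try:
                obj = json.loads(line)
            except json.JSONDecodeError as e:
                errors.append(f"line {lineno}: invalid JSON ({e})")
                continue
            missing = [k for k in required_keys if k not in obj]
            if missing:
                errors.append(f"line {lineno}: missing keys {missing}")
    return errors
```

For the completion format, call validate_jsonl("data/train.jsonl", ["prompt", "completion"]); for the text format, pass ["text"] instead.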

Fine-tuning with config.yaml

To fine-tune a model using your config.yaml file, run:

mlx_lm.lora --config config.yaml
  • The config file should specify model, data, and other parameters. Command-line flags override config values.
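The override behavior can be pictured as a simple dictionary merge. This is a conceptual sketch of the precedence rule, not MLX-LM's actual argument handling:

```python
# Values as if loaded from config.yaml (illustrative subset).
config = {
    "model": "mlx-community/Llama-3.2-1B-Instruct",
    "iters": 1000,
    "batch_size": 4,
}

# Flags passed on the command line,
# e.g. `mlx_lm.lora --config config.yaml --iters 500`.
cli_flags = {"iters": 500}

# Later entries win, so CLI flags take precedence over config values.
effective = {**config, **cli_flags}
# effective["iters"] is now 500, while batch_size keeps its config value.
```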

Generating and Chatting with Your Model

After fine-tuning, you can generate text or chat with your model using the trained adapters, or by fusing the adapters into a new model.

1. Generate Text with Adapters

To generate text using your fine-tuned adapters:

mlx_lm.generate --model <path_to_model> --adapter-path <path_to_adapters> --prompt "<your_model_prompt>"
  • Example:
    mlx_lm.generate --model mistralai/Mistral-7B-v0.1 --adapter-path adapters --prompt "Hello, world!"
  • Note: mlx_lm.generate may not accept a --config flag in your version (check mlx_lm.generate --help); if not, pass --model and --adapter-path explicitly as above.

2. Chat with the Model (with Adapters)

To interact with your model in chat mode using the adapters:

mlx_lm.chat --model <path_to_model> --adapter-path <path_to_adapters>
  • Example:
    mlx_lm.chat --model mistralai/Mistral-7B-v0.1 --adapter-path adapters
  • As with generation, mlx_lm.chat may not accept a --config flag (check mlx_lm.chat --help); if not, pass --model and --adapter-path directly.

3. Fuse Adapters into a New Model

You can fuse the adapters into a new standalone model for easier deployment:

mlx_lm.fuse --model <path_to_model> --adapter-path <path_to_adapters> --save-path fused_model/
  • Example:
    mlx_lm.fuse --model mistralai/Mistral-7B-v0.1 --adapter-path adapters --save-path fused_model/
  • This will create a new standalone model in the fused_model/ directory. (Recent MLX-LM versions use --save-path for the output directory; run mlx_lm.fuse --help to confirm the flag name in yours.)

4. Generate or Chat with the Fused Model

To generate text with the fused model:

mlx_lm.generate --model fused_model/ --prompt "<your_model_prompt>"

To chat with the fused model:

mlx_lm.chat --model fused_model/
  • Replace <path_to_model>, <path_to_adapters>, config.yaml, or fused_model/ with your actual paths as needed.
  • For more options, see the help for each command (e.g., mlx_lm.generate --help, mlx_lm.chat --help).

Example config.yaml

Below is an example config.yaml for MLX-LM LoRA fine-tuning:

# The path to the local model directory or Hugging Face repo.
model: "mlx-community/Llama-3.2-1B-Instruct"

# Whether or not to train (boolean)
train: true

# The fine-tuning method: "lora", "dora", or "full".
fine_tune_type: lora

# Optimizer
optimizer: adamw
# optimizer_config:
#   adamw:
#     betas: [0.9, 0.98]
#     eps: 1e-6
#     weight_decay: 0.05
#   bias_correction: true

# Directory with {train, valid, test}.jsonl files or Hugging Face dataset name
data: "mlx-community/WikiSQL"

# The PRNG seed
seed: 0

# Number of layers to fine-tune
num_layers: 16

# Minibatch size.
batch_size: 4

# Iterations to train for.
iters: 1000

# Number of validation batches, -1 uses the entire validation set.
val_batches: 25

# Learning rate.
learning_rate: 1e-5

# Save/load path for the trained adapter weights.
adapter_path: "adapters"

# Save the model every N iterations.
save_every: 100

# Evaluate on the test set after training
test: false

# Number of test set batches, -1 uses the entire test set.
test_batches: 100

# Maximum sequence length.
max_seq_length: 2048

# Use gradient checkpointing to reduce memory use.
grad_checkpoint: false

# LoRA parameters can only be specified in a config file
lora_parameters:
  # The layer keys to apply LoRA to.
  # These will be applied to the last num_layers layers.
  keys: ["self_attn.q_proj", "self_attn.v_proj"]
  rank: 8
  scale: 20.0
  dropout: 0.0
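To get a feel for what rank controls, the sketch below computes how many trainable parameters a LoRA adapter adds to one linear layer; the 2048-dimension figure is an illustrative assumption, not read from any real model:

```python
def lora_params_per_layer(in_dim, out_dim, rank):
    # LoRA factorizes the weight update as B @ A, with A of shape
    # (rank, in_dim) and B of shape (out_dim, rank), so the adapter
    # adds rank * in_dim + out_dim * rank trainable parameters.
    return rank * in_dim + out_dim * rank

# With the config above: rank 8, two keys (q_proj, v_proj) per layer,
# applied to num_layers = 16 layers of a hypothetical 2048-dim model.
per_projection = lora_params_per_layer(2048, 2048, 8)
total = per_projection * 2 * 16
```

Roughly a million trainable parameters in this hypothetical case, versus the billion-plus in the base model, which is why LoRA fine-tuning is feasible on a laptop.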

Additional Notes

  • You can see all available options with:
    mlx_lm.lora --help
  • For more details on the config format, see the example config.yaml above.
  • Local datasets should be in data/ as train.jsonl, valid.jsonl, and/or test.jsonl.

For more, see the MLX-LM LoRA documentation.

About

mlx-ml is the open-source repo I created to simplify MLX-LM workflows for LoRA fine-tuning, inference, and deployment of language models on Apple Silicon. Use a single config.yaml to define models, datasets, LoRA params (rank, scale, dropout), optimizers, and more, then run terminal commands for training, text generation, and interactive chat.
