
MLX-ML - Product Requirements Document (PRD)

Version: 1.0
Last Updated: February 2026
Status: Active Development


Executive Summary

MLX-ML is a utility for fine-tuning large language models using MLX-LM with LoRA (Low-Rank Adaptation). It provides a config.yaml-based approach to model fine-tuning, making it easy to adapt models for specific tasks without full retraining.

Target Users

  • Primary: ML engineers fine-tuning models with LoRA
  • Secondary: Researchers experimenting with model adaptation
  • Tertiary: Developers wanting to customize LLMs for specific use cases

Unique Value Proposition

  • Config-Based: YAML configuration for easy fine-tuning
  • LoRA Efficient: Low-Rank Adaptation for memory-efficient tuning
  • Multiple Formats: Supports completions, text, and chat formats
  • Model Fusion: Fuse adapters into standalone models for deployment

1. Product Vision

MLX-ML aims to make model fine-tuning accessible through simple configuration files, enabling users to adapt large language models for specific tasks without extensive technical knowledge or expensive infrastructure.

Success Metrics

  • Fine-Tuning Success: Percentage of successful fine-tuning runs
  • Model Quality: Performance of fine-tuned models
  • Config Clarity: Ease of use of config.yaml
  • Documentation Quality: Clarity of examples and guides

2. Core Features

2.1 Config-Based Fine-Tuning

Priority: P0 (Critical)

Description: Users can fine-tune models using a simple YAML configuration file.

Requirements:

  • YAML config file support
  • Model specification
  • Data configuration
  • Training parameters
  • LoRA parameters

User Stories:

  • As an ML engineer, I want config-based fine-tuning so I can easily experiment
  • As a researcher, I want parameter configuration so I can optimize training
  • As a user, I want LoRA parameters so I can fine-tune efficiently

Technical Notes:

  • YAML parsing and validation
  • Parameter validation
  • Config examples
  • Error handling
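
As an illustration of these requirements, a minimal config.yaml might look like the sketch below. The values are placeholders and the exact key names vary between mlx-lm releases, so the example configuration shipped with mlx-lm should be treated as the reference.

# Illustrative config.yaml for LoRA fine-tuning (key names vary by mlx-lm version)
model: "mlx-community/Mistral-7B-Instruct-v0.3-4bit"   # example Hugging Face model ID
train: true                # run training
data: "data/"              # directory containing train.jsonl / valid.jsonl / test.jsonl
adapter_path: "adapters/"  # where LoRA adapters are written
batch_size: 4
iters: 1000
learning_rate: 1e-5
lora_parameters:
  rank: 8                  # low-rank dimension
  dropout: 0.0
  scale: 20.0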

2.2 Dataset Preparation

Priority: P0 (Critical)

Description: Users can prepare datasets in multiple formats for fine-tuning.

Requirements:

  • Completions format support
  • Text format support
  • Chat format support
  • Data validation

User Stories:

  • As a user, I want multiple format support so I can use existing datasets
  • As a user, I want data validation so I can ensure quality
  • As a user, I want format examples so I can get started quickly

Technical Notes:

  • Format validation
  • JSON parsing
  • Error reporting
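
For illustration, one record per format might look like the lines below; each dataset file (train.jsonl, valid.jsonl, test.jsonl) holds one JSON object per line. The field names follow common mlx-lm conventions and should be checked against the mlx-lm data documentation for the installed version.

Completions format (prompt/completion pairs):
{"prompt": "What is LoRA?", "completion": "LoRA is a parameter-efficient fine-tuning method."}

Text format (plain text samples):
{"text": "LoRA adapts a model by training small low-rank weight matrices."}

Chat format (a list of role/content messages):
{"messages": [{"role": "user", "content": "What is LoRA?"}, {"role": "assistant", "content": "A parameter-efficient fine-tuning method."}]}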

2.3 Model Generation & Chat

Priority: P1 (High)

Description: Users can generate text and chat with fine-tuned models.

Requirements:

  • Text generation with adapters
  • Chat mode with adapters
  • Model fusion for deployment
  • Prompt-based generation

User Stories:

  • As a developer, I want text generation so I can test my model
  • As a user, I want chat mode so I can interact with my model
  • As a deployer, I want model fusion so I can deploy easily

Technical Notes:

  • Adapter loading
  • Chat interface
  • Model fusion
  • Prompt handling
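
As a sketch of how these features map onto the mlx-lm command line, the commands below pass the model and adapter locations as explicit flags instead of reading them from config.yaml. The flag names reflect common mlx-lm usage and are assumptions here; they may differ between releases.

# Generate text with a base model plus trained LoRA adapters
mlx_lm.generate --model mlx-community/Mistral-7B-Instruct-v0.3-4bit --adapter-path adapters/ --prompt "Summarize LoRA in one sentence." --max-tokens 128

# Interactive chat using the same adapters
mlx_lm.chat --model mlx-community/Mistral-7B-Instruct-v0.3-4bit --adapter-path adapters/

# Fuse adapters into a standalone model for deployment
mlx_lm.fuse --model mlx-community/Mistral-7B-Instruct-v0.3-4bit --adapter-path adapters/ --save-path fused_model/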

3. Technical Architecture

3.1 Technology Stack

Layer           Technology            Purpose
Tool            MLX-LM                Fine-tuning framework
Language        Python 3.11+          Scripting language
Models          Hugging Face Models   Pre-trained models
Data Formats    JSONL                 Dataset format
Configuration   YAML                  Configuration files

3.2 System Architecture

┌─────────────────────────────────────────────────────────────┐
│                    MLX-ML Utility                          │
│  ┌─────────────────────────────────────────────────────────┐│
│  │  Python Scripts (Python 3.11+)                          ││
│  │  - config.yaml Parser                                   ││
│  │  - Dataset Preparation                                 ││
│  │  - Fine-Tuning Execution                                ││
│  │  - Model Generation                                    ││
│  │  - Chat Interface                                      ││
│  │  - Model Fusion                                        ││
│  └─────────────────────────────────────────────────────────┘│
│                            │                                 │
│                            ▼                                 │
│  ┌─────────────────────────────────────────────────────────┐│
│  │  MLX-LM Framework                                     ││
│  │  - LoRA Fine-Tuning                                    ││
│  │  - Model Loading                                        ││
│  │  - Adapter Management                                  ││
│  │  - Training Loop                                        ││
│  └─────────────────────────────────────────────────────────┘│
│                            │                                 │
│                            ▼                                 │
│  ┌─────────────────────────────────────────────────────────┐│
│  │  Hugging Face Models                                    ││
│  │  - Pre-trained Models                                    ││
│  │  - Model Downloading                                    ││
│  │  - Model Loading                                        ││
│  └─────────────────────────────────────────────────────────┘│
│                            │                                 │
│                            ▼                                 │
│  ┌─────────────────────────────────────────────────────────┐│
│  │  Dataset Files (JSONL)                                   ││
│  │  - train.jsonl                                           ││
│  │  - valid.jsonl                                           ││
│  │  - test.jsonl                                            ││
│  └─────────────────────────────────────────────────────────┘│
└─────────────────────────────────────────────────────────────┘

3.3 Data Flow

Fine-Tuning Flow:

  1. User creates config.yaml
  2. User prepares dataset files
  3. User runs mlx_lm.lora --config config.yaml
  4. Config parsed and validated
  5. Model loaded from Hugging Face
  6. Dataset loaded and validated
  7. Training begins
  8. Adapters saved periodically
  9. Training completes
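
Assuming the dataset files live in data/ and config.yaml points at them, the steps above reduce to something like the following; the adapter output directory is whatever adapter_path names in the config (shown here as adapters/).

# 1-2. Dataset files referenced by config.yaml
ls data/        # train.jsonl  valid.jsonl  test.jsonl

# 3. Start LoRA fine-tuning from the config
mlx_lm.lora --config config.yaml

# 8-9. Adapter checkpoints appear in the configured adapter directory
ls adapters/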

Generation Flow:

  1. User runs mlx_lm.generate --config config.yaml --prompt "..."
  2. Config parsed for model and adapter paths
  3. Model and adapters loaded
  4. Prompt processed
  5. Text generated
  6. Response returned
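
The same flow can also be scripted in Python. This is a minimal sketch assuming a recent mlx-lm release in which load() accepts an adapter_path argument and generate() takes a prompt and token budget; the model name is an example.

# Minimal generation sketch using the mlx_lm Python API
from mlx_lm import load, generate

# Load the base model and apply the trained LoRA adapters
model, tokenizer = load(
    "mlx-community/Mistral-7B-Instruct-v0.3-4bit",  # example base model
    adapter_path="adapters/",                        # directory written by mlx_lm.lora
)

# Generate a response for a single prompt
response = generate(model, tokenizer, prompt="Hello, world!", max_tokens=128)
print(response)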

4. User Experience

4.1 Onboarding

First-Time User Experience:

  1. Documentation

    • Quick start guide
    • Config examples
    • Dataset preparation guide
  2. Examples

    • Example config.yaml
    • Example datasets
    • Generation examples

4.2 Daily Use

Typical Session:

  1. User prepares dataset
  2. User creates config.yaml
  3. User runs fine-tuning command
  4. User monitors training progress
  5. User tests fine-tuned model
  6. User generates text or chats with model

4.3 Error States

Graceful Degradation:

  • Invalid config: "Invalid config.yaml. Check the syntax and parameters."
  • Data error: "The dataset file could not be found. Check the format."
  • Model error: "Could not load the model. Check the path."

5. Roadmap

Phase 1: MVP (Current)

  • ✅ Config-based fine-tuning
  • ✅ Dataset preparation
  • ✅ Model generation
  • ✅ Chat interface
  • ✅ Model fusion

Phase 2: Enhanced Experience (Q1 2026)

  • 🔄 Advanced config options
  • 🔄 Multiple model support
  • 🔄 Batch processing
  • 🔄 Progress tracking

Phase 3: Advanced Features (Q2 2026)

  • 📝 Distributed training
  • 🔍 Hyperparameter tuning
  • 🏆 Model evaluation
  • 🤖 Integration with MLOps workflows

6. Success Criteria

Technical

  • Config parsing accuracy > 95%
  • Fine-tuning success rate > 90%
  • Model loading success rate > 95%
  • Generation response time < 2 seconds

User Experience

  • Config clarity > 4.5/5
  • Documentation completeness > 90%
  • Error message clarity > 4/5
  • Example quality > 4.5/5

Business

  • 50+ fine-tuning runs per week
  • 10+ models fine-tuned per week
  • 90% fine-tuning success rate
  • 95% model generation success

7. Risks & Mitigations

Risk 1: Model Loading

Risk: Hugging Face model download failures

Mitigation:

  • Retry logic
  • Error handling
  • User guidance

Risk 2: Dataset Quality

Risk: Poor quality datasets affecting fine-tuning

Mitigation:

  • Data validation
  • Quality checks
  • User warnings

Risk 3: Training Failures

Risk: Training crashes or timeouts

Mitigation:

  • Progress saving
  • Error recovery
  • User notifications

8. Dependencies

Libraries

  • mlx-lm: Fine-tuning framework
  • huggingface_hub: Model downloading

Platforms

  • Hugging Face: Model repository
  • Local File System: Dataset storage

9. Appendix

A. Environment Setup

# Create virtual environment
python3 -m venv .venv
source .venv/bin/activate

# Install MLX-LM
pip install mlx-lm

# For latest development version
pip install git+https://github.com/ml-explore/mlx-lm.git

B. Installation Instructions

# Clone the repository
git clone https://github.com/magnusfroste/mlx-ml.git
cd mlx-ml

# Install dependencies
pip install -r requirements.txt

# Create dataset files
# Place train.jsonl, valid.jsonl, test.jsonl in data/ directory

# Create config.yaml
# See example config.yaml for reference

# Run fine-tuning
mlx_lm.lora --config config.yaml

# Generate text with adapters
mlx_lm.generate --config config.yaml --prompt "Hello, world!"

# Chat with model with adapters
mlx_lm.chat --config config.yaml

# Fuse adapters into new model
mlx_lm.fuse --config config.yaml --output fused_model/

C. Config Examples

See config.yaml for a complete example configuration with all available parameters.


Document History:

Version   Date       Changes                Author
1.0       Feb 2026   Initial PRD creation   Magnus Froste

License: MIT - See LICENSE file for details