Add Laguna M.1 model#1415

Open
eauchs wants to merge 2 commits into
ml-explore:mainfrom
eauchs:add-laguna-m1

@eauchs eauchs commented Jun 18, 2026

Adds MLX support for Poolside Laguna M.1 (Apache 2.0), a 225B-total / 23B-active MoE for agentic coding.

mlx_lm/models/laguna.py implements:

  • MoE: 256 experts + 1 shared, top-k=16, sigmoid routing (with e_score_correction_bias), via SwitchGLU; first 3 layers dense SwiGLU, remaining 67 sparse.
  • Attention: GQA (64 Q / 8 KV heads), per-head QK-norm, softplus attention output gating (g_proj), RoPE + YaRN.
  • sanitize() handles the original HF layout: FP8 (compressed-tensors) dequant, e_score_correction_bias remap, per-expert → SwitchGLU stacking. quant_predicate keeps the router gate full-precision.
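To make the routing bullet concrete, here is a minimal pure-Python sketch of sigmoid top-k routing with a score-correction bias. It is not the code in mlx_lm/models/laguna.py. It assumes the common convention in which the bias only influences which experts are selected, while the mixing weights come from the unbiased sigmoid scores renormalized over the chosen experts. The function name route and the toy sizes (4 experts, top_k=2 instead of 256 and 16) are hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def route(logits, correction_bias, top_k):
    """Pick top_k experts for one token (illustrative, not the PR's code).

    Assumption: the correction bias affects selection only; the returned
    weights use the unbiased sigmoid scores, renormalized to sum to 1.
    """
    scores = [sigmoid(z) for z in logits]
    biased = [s + b for s, b in zip(scores, correction_bias)]
    # Rank experts by biased score, highest first, and keep the top_k.
    chosen = sorted(range(len(scores)), key=lambda i: biased[i], reverse=True)[:top_k]
    total = sum(scores[i] for i in chosen)
    weights = [scores[i] / total for i in chosen]
    return chosen, weights


# Toy example: expert 2 has a modest logit, but its bias lifts it to first place.
logits = [2.0, 1.0, 0.5, -1.0]
bias = [0.0, 0.0, 0.5, 0.0]
experts, weights = route(logits, bias, top_k=2)
```

In the toy run, expert 2 is selected ahead of expert 0 because of its bias, but expert 0 still receives the larger mixing weight, since the weights ignore the bias.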

Tested end-to-end: converted from poolside/Laguna-M.1-FP8 and ran a 3-bit build locally on an M3 Max 128 GB (~26 tok/s, ~100 GB peak). Quantized MLX weights: ox-ox/Laguna-M.1-MLX-Q3.
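As a rough sanity check on the ~100 GB figure, the weight footprint can be estimated from the parameter count. The PR does not state the quantization settings, so this sketch assumes group-wise affine quantization with one 16-bit scale and one 16-bit bias per group of 64 weights. It also ignores the full-precision router gate, the KV cache and activations. The helper name quantized_size_gb is hypothetical.

```python
def quantized_size_gb(n_params, bits, group_size=64, scale_bits=16, bias_bits=16):
    """Estimate quantized weight size in GB (10^9 bytes).

    Assumption: every weight is quantized to `bits`, and each group of
    `group_size` weights also stores one scale and one bias.
    """
    bits_per_weight = bits + (scale_bits + bias_bits) / group_size
    return n_params * bits_per_weight / 8 / 1e9


# 225B total parameters at a nominal 3 bits -> 3.5 effective bits per weight.
weights_gb = quantized_size_gb(225e9, bits=3)
```

Under these assumptions the weights alone come to roughly 98 GB, which is in the same range as the reported peak.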

Models like Poolside Laguna M.1 emit <tool_call>name\n<arg_key>..., so the
parser captured everything up to <arg_key> as the function name, including the
trailing newline (e.g. "get_weather\n"), which broke tool dispatch in
OpenAI-compatible clients.