Pull requests: ml-explore/mlx-lm

Pull requests list

Bump the patch version
#959 opened Mar 7, 2026 by angeloskath
Add --served-model-name flag to server
#947 opened Mar 3, 2026 by otarkhan
2 tasks done
Support KV cache quantization with continuous batching
#941 opened Mar 2, 2026 by ochafik
3 tasks done
feat: add context_length to /v1/models response
#933 opened Feb 27, 2026 by lichengzhe
Hybrid cache for Qwen3.5
#923 opened Feb 23, 2026 by dexloom
3 tasks
Add --q-override for per-layer quantization
#922 opened Feb 23, 2026 by spicyneuron
Add MTP support for Step 3.5 Flash
#901 opened Feb 16, 2026 by janhilgard
5 of 6 tasks
Add per-sequence trim support to BatchKVCache
#873 opened Feb 11, 2026 by 0xDaizz
Fix dynamic_quant for MoE and VL models
#870 opened Feb 10, 2026 by Taderich73
3 tasks done
server: log bad tool_calls from model
#850 opened Feb 6, 2026 by percontation
Add support for LongCat ZigZag Attention
#802 opened Jan 23, 2026 by kernelpool
Add Zamba2
#724 opened Jan 4, 2026 by proazr