Include context_length in /v1/models response (#1183) #1184

Open

seikixtc wants to merge 1 commit into ml-explore:main from
Conversation
Closes #1183.
What
Adds a `context_length` field to every entry returned by the `/v1/models` endpoint (and the `/v1/models/{repo_id}` single-model variant), reporting the maximum context length the model declares in its `config.json`.

Example response after this change:
{ "object": "list", "data": [ { "id": "mlx-community/Qwen2.5-7B-Instruct-4bit", "object": "model", "created": 1745000000, "context_length": 32768 } ] }Why
OpenAI-compatible clients increasingly need to know a model's usable context window before constructing a request: for chunking long documents, sizing prompt caches, choosing between models, and rendering token-budget UIs. Today the endpoint exposes only `id`, `object`, and `created`, so clients have to hard-code limits or guess.
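For illustration, here is a minimal sketch of how a client could consume the new field; the base URL and the use of `requests` are assumptions, not part of this PR:

```python
# Hypothetical consumer of the new field. Assumes an mlx_lm server is
# already running locally; the URL below is an assumption.
import requests

resp = requests.get("http://localhost:8080/v1/models")
resp.raise_for_status()

for model in resp.json()["data"]:
    # context_length may be None when config.json declares no limit
    ctx = model.get("context_length")
    print(f"{model['id']}: context_length={ctx}")
```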
How

- Adds `_get_context_length(config_path)` to read the first recognized max-context field from `config.json`: `max_position_embeddings`, `n_positions`, `max_sequence_length`, and `seq_length` (a sketch follows this list)
- Adds `_find_repo_config_path(repo)` so Hugging Face cache scans use the top-level snapshot `config.json` rather than any nested file with the same basename
- Returns `None` instead of raising if the config is missing, malformed, or has no valid positive-integer context field
- Reports `context_length` in both cached-model listings and local `--model` listings
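A minimal sketch of how these two helpers might look, matching the behavior described above; the exact implementation in `mlx_lm/server.py` may differ, and the `huggingface_hub` cache-scan types (`CachedRepoInfo`, `snapshot_path`) are assumptions:

```python
import json
from pathlib import Path
from typing import Optional

# Recognized keys, checked in priority order; the first hit wins.
_CONTEXT_LENGTH_KEYS = (
    "max_position_embeddings",
    "n_positions",
    "max_sequence_length",
    "seq_length",
)


def _get_context_length(config_path: Path) -> Optional[int]:
    """Return the model's declared max context length, or None.

    None is returned if config.json is missing, malformed, or carries no
    positive-integer value under any recognized key.
    """
    try:
        with open(config_path) as f:
            config = json.load(f)
    except (OSError, json.JSONDecodeError):
        return None
    if not isinstance(config, dict):
        return None
    for key in _CONTEXT_LENGTH_KEYS:
        value = config.get(key)
        # bool is a subclass of int in Python, so reject it explicitly
        if isinstance(value, int) and not isinstance(value, bool) and value > 0:
            return value
    return None


def _find_repo_config_path(repo) -> Optional[Path]:
    """Pick the top-level config.json from a cached repo's snapshots.

    `repo` is assumed to be a CachedRepoInfo from huggingface_hub's
    scan_cache_dir(); nested config.json files in subfolders are ignored.
    """
    for revision in repo.revisions:
        candidate = revision.snapshot_path / "config.json"
        if candidate.is_file():
            return candidate
    return None
```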
Tests

- New unit tests for `_get_context_length` cover recognized fields, priority ordering, missing fields, malformed JSON, missing files, non-positive values, string values, bool values, and `None` input (an illustrative sketch follows this list)
- New unit tests for `_find_repo_config_path` cover top-level-vs-nested `config.json` selection in Hugging Face cache snapshots
- `test_handle_models` now asserts every model entry includes `context_length`, with value either `None` or a positive integer
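As a hypothetical example of the shape of these tests (the test names here are illustrative; the actual names in `tests/test_server.py` may differ):

```python
import json
import tempfile
import unittest
from pathlib import Path

from mlx_lm.server import _get_context_length


class TestGetContextLength(unittest.TestCase):
    def _write_config(self, payload: str) -> Path:
        # Write a throwaway config.json with the given contents
        tmp = tempfile.NamedTemporaryFile("w", suffix=".json", delete=False)
        tmp.write(payload)
        tmp.close()
        return Path(tmp.name)

    def test_priority_ordering(self):
        # max_position_embeddings outranks n_positions
        path = self._write_config(
            json.dumps({"n_positions": 2048, "max_position_embeddings": 32768})
        )
        self.assertEqual(_get_context_length(path), 32768)

    def test_malformed_json_returns_none(self):
        path = self._write_config("{not valid json")
        self.assertIsNone(_get_context_length(path))

    def test_missing_file_returns_none(self):
        self.assertIsNone(_get_context_length(Path("/nonexistent/config.json")))
```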
Verification

- `python3 -m unittest discover -s tests -p 'test_server.py' -v`
- `pre-commit run --files mlx_lm/server.py tests/test_server.py`
- `git diff --check`

Backward compatibility
Purely additive. Existing clients that only read `id`/`object`/`created` continue to work unchanged.