fix: merge config.json eos_token_id with tokenizer-derived value instead of replacing it by EmreCelenli · Pull Request #1414 · ml-explore/mlx-lm

EmreCelenli · 2026-06-18T18:48:40Z

Summary

TokenizerWrapper.__init__ treats eos_token_ids passed in from config.json
as a hard replacement for whatever the tokenizer itself correctly derives.
When config.json has a stale value (common in third-party MLX-quantized repos),
the correct stop token is silently discarded, causing generation to never stop
at the right boundary. The real stop token (e.g. <|im_end|>) leaks into output
as plain text and the model keeps generating until max tokens.

Root Cause

In utils.py, both load() and sharded_load() do:

tokenizer = load_tokenizer(
    model_path, tokenizer_config, eos_token_ids=config.get("eos_token_id", None)
)

And TokenizerWrapper.__init__ treats this as a strict replacement:

self._eos_token_ids = (
    set(eos_token_ids)
    if eos_token_ids is not None
    else {tokenizer.eos_token_id}
)

So if config.json has a stale value, the tokenizer's own correct derivation
is silently thrown away with no warning.

Fix

if eos_token_ids is not None:
    self._eos_token_ids = set(
        [eos_token_ids] if isinstance(eos_token_ids, int) else eos_token_ids
    )
else:
    self._eos_token_ids = set()
if tokenizer.eos_token_id is not None:
    self._eos_token_ids.add(tokenizer.eos_token_id)

Verified

Model: mlx-community/Qwen2.5-Coder-7B-Instruct-4bit, 16GB M1 Mac.
config.json has eos_token_id: 151643 (<|endoftext|>, stale pretraining EOS).
Tokenizer correctly derives 151645 (<|im_end|>).

# Before fix
Via load(): {151643}          # correct value silently replaced, generation never stops

# After fix
Via load(): {151643, 151645}  # both present, generation stops correctly

Real-world symptom: <|im_end|> no longer leaks into output when using
mlx_lm.server with this model. Generation speed returns to normal since
the model stops at the correct boundary instead of running to max tokens.

Tests

203 tests pass. Black formatting check passed.

…ead of replacing it

fix: merge config.json eos_token_id with tokenizer-derived value inst…

50eb7d2

…ead of replacing it

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: merge config.json eos_token_id with tokenizer-derived value instead of replacing it#1414

fix: merge config.json eos_token_id with tokenizer-derived value instead of replacing it#1414
EmreCelenli wants to merge 1 commit into
ml-explore:mainfrom
EmreCelenli:fix/config-eos-override-silent-replacement

EmreCelenli commented Jun 18, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

EmreCelenli commented Jun 18, 2026

Summary

Root Cause

Fix

Verified

Tests

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant