
Support output_logits='generation' and output_last_hidden_state in PyTorch backend#4534

Draft
Copilot wants to merge 4 commits into main from copilot/fix-logits-last-hidden-state-issue

Conversation

Contributor

Copilot AI commented Apr 17, 2026

Motivation

The PyTorch engine silently discarded output_logits='generation' and output_last_hidden_state requests, returning None for both fields regardless of the generation config. Previously, only output_logits='all' with max_new_tokens=0 was supported. This PR enables per-step logit and hidden-state collection during generation.

Modification

  • messages.py: Add out_logits_mode / out_last_hidden_states_mode to SamplingParam; update from_gen_config to pass 'generation' mode through (keep 'all' restricted to max_new_tokens=0). Add HistoryHiddenStates class (same int16 bit-reinterpretation storage as HistoryLogits) plus all_hidden_states, return_hidden_states, hidden_states_generation_mode, hidden_states, and append_hidden_states on SchedulerSequence.
  • engine.py: Add last_hidden_state field to InferOutput.
  • model_agent/agent.py: Add hidden_states to BatchedOutputs; update _async_model_forward to extract last-position hidden states ('generation') or full-sequence hidden states ('all'); thread return_hidden_states / hidden_states_all_mode through _async_step and _step_postprocess_with_output.
  • inputs_maker.py: Compute return_hidden_states and hidden_states_all_mode flags per batch.
  • engine_loop.py: In _make_infer_outputs, accumulate last-position logits/hidden-states at each step for 'generation' mode and emit on finish; handle 'all' mode split by sequence length. Include last_hidden_state in _send_resp.
  • engine_instance.py: Read last_hidden_state from response data and forward to EngineOutput.
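The "int16 bit-reinterpretation storage" mentioned for HistoryHiddenStates can be illustrated with a small stand-alone sketch. This uses NumPy as a stand-in for the analogous torch view trick, and the variable names here are illustrative, not taken from the PR: the idea is that float16 bits are reinterpreted as int16 for storage and recovered losslessly by viewing back.

```python
import numpy as np

# Illustrative sketch (NumPy stand-in for torch's tensor.view(torch.int16)):
# store fp16 activations as int16 by reinterpreting the raw bits, then
# recover the exact fp16 values later by viewing back. No rounding occurs
# because only the dtype label changes, never the underlying bytes.
hidden = np.random.randn(4, 8).astype(np.float16)

stored = hidden.view(np.int16)       # same buffer, integer dtype
restored = stored.view(np.float16)   # exact bit-for-bit round trip

assert stored.dtype == np.int16
assert np.array_equal(hidden, restored)
```

Storing activations under an integer dtype sidesteps any float-specific handling in generic container code while keeping the round trip exact.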

Use cases (Optional)

from lmdeploy import pipeline, GenerationConfig

pipe = pipeline('Qwen/Qwen3-VL-7B-Instruct')
gen_config = GenerationConfig(
    temperature=0.0,
    top_k=1,
    output_logits='generation',
    output_last_hidden_state='generation',
    max_new_tokens=128,
)
responses = pipe(['Hi, introduce yourself', 'Shanghai is'], gen_config=gen_config)
hidden_states = [r.last_hidden_state for r in responses]  # [num_steps, hidden_dim]
logits = [r.logits for r in responses]                    # [num_steps, vocab_size]
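The 'generation'-mode accumulation described above (keep only the last position at each decode step, stack on finish) can be sketched as follows. The function name and the plain-array representation are hypothetical simplifications, not the actual engine_loop code:

```python
import numpy as np

def accumulate_generation_outputs(step_outputs):
    """Hypothetical sketch of 'generation'-mode collection.

    step_outputs: one array per decode step with shape
    [positions_in_step, dim] (prefill may cover many positions,
    decode steps typically one). Only the last position of each
    step is kept, yielding a [num_steps, dim] result on finish.
    """
    collected = [step[-1] for step in step_outputs]  # last position only
    return np.stack(collected)                       # [num_steps, dim]
```

This mirrors the per-step behavior the PR describes: each response ends up with logits of shape [num_steps, vocab_size] and hidden states of shape [num_steps, hidden_dim].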

Checklist

  1. Pre-commit or other linting tools are used to fix potential lint issues.
  2. The modification is covered by complete unit tests. If not, please add more unit tests to ensure the correctness.
  3. If the modification has a dependency on downstream projects of a newer version, this PR should be tested with all supported versions of downstream projects.
  4. The documentation has been modified accordingly, like docstring or example tutorials.

Copilot AI and others added 3 commits April 17, 2026 07:46
Copilot AI changed the title from "[WIP] Fix logits and last_hidden_state returns for qwen3VL" to "Support output_logits='generation' and output_last_hidden_state in PyTorch backend" on Apr 17, 2026
Copilot AI requested a review from CUHKSZzxy April 17, 2026 07:53


Development

Successfully merging this pull request may close these issues.

[Bug] Why can't qwen3VL return logits and last_hidden_state?
