perf: avoid output_hidden_states when only last_hidden_state is used #4755

ciaoyizhen · 2025-12-27T10:00:38Z

What does this PR do?

Stop passing output_hidden_states=True to the backbone when only the final-layer
hidden state is used to compute the reward logits.

Currently, the backbone is called with output_hidden_states=True, but only
hidden_states[-1] is consumed. This forces the backbone to return the hidden
states of every layer, which increases memory usage and adds overhead.
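To give a rough sense of scale, here is a minimal back-of-the-envelope sketch of how much tensor data the returned hidden_states tuple keeps referenced, compared with a single final hidden state. The helper name and the layer count, batch size, sequence length, and hidden size are hypothetical values chosen for illustration, not the configuration of any particular reward model. It assumes 2 bytes per element (fp16/bf16) and that the tuple holds one tensor for the embeddings plus one per layer.

```python
def hidden_state_bytes(num_layers, batch_size, seq_len, hidden_size, bytes_per_elem=2):
    """Return (bytes of one hidden-state tensor, bytes of the full hidden_states tuple)."""
    # One tensor of shape (batch_size, seq_len, hidden_size).
    per_tensor = batch_size * seq_len * hidden_size * bytes_per_elem
    # Assumption: the tuple contains the embedding output plus one tensor per layer.
    all_states = (num_layers + 1) * per_tensor
    return per_tensor, all_states


# Hypothetical sizes, for illustration only.
per_tensor, all_states = hidden_state_bytes(
    num_layers=24, batch_size=8, seq_len=2048, hidden_size=2048
)

MIB = 1024 ** 2
print(f"last_hidden_state only: {per_tensor / MIB:.1f} MiB")  # 64.0 MiB
print(f"full hidden_states:     {all_states / MIB:.1f} MiB")  # 1600.0 MiB
```

The actual saving depends on the model, the dtype, and whether gradients are being tracked, so this should be read as an estimate rather than a measurement.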

This PR reads output.last_hidden_state instead, which is equivalent to
hidden_states[-1], so the computed reward logits are unchanged.

Fixes # (issue)

Before submitting

This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
Did you read the contributor guideline,
Pull Request section?
Was this discussed/approved via a GitHub issue? Please add a link
to it if that's the case.
Did you make sure to update the documentation with your changes?
Did you write any new necessary tests?

Who can review?

Anyone in the community is welcome to review this PR.

ciaoyizhen · 2025-12-27T10:02:39Z

from transformers import AutoModel, AutoTokenizer

reward_model_ckpt = "/data1/caoyizhen/caoyizhen/open-source-models/internlm2-1_8b-reward"

model = AutoModel.from_pretrained(reward_model_ckpt, trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained(reward_model_ckpt, trust_remote_code=True)
print(model.base_model_prefix)

a = "你好"  # "Hello"
input_dict = tokenizer(a, return_tensors="pt")

# Before: request every layer's hidden states and read the last entry.
output = model.model(
    **input_dict,
    return_dict=True,
    output_hidden_states=True,
    use_cache=False,  # otherwise mistral-based RM would error out
)
print(output.hidden_states[-1])

# After: read the final hidden state directly.
output = model.model(
    **input_dict,
    return_dict=True,
    output_hidden_states=False,
    use_cache=False,  # otherwise mistral-based RM would error out
)
print(output.last_hidden_state)

The two printed tensors are the same.
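The same comparison can be made programmatically without downloading a checkpoint. The following is a minimal sketch that assumes a tiny, randomly initialized LlamaModel is an acceptable stand-in for the reward model's backbone; the config sizes are arbitrary, and other architectures (especially ones loaded with trust_remote_code) may order their hidden states differently, so it is worth re-running the check on the actual model.

```python
import torch
from transformers import LlamaConfig, LlamaModel

# Tiny random-weight backbone; the sizes are arbitrary and only keep the check fast.
config = LlamaConfig(
    vocab_size=128,
    hidden_size=32,
    intermediate_size=64,
    num_hidden_layers=4,
    num_attention_heads=4,
    num_key_value_heads=4,
    max_position_embeddings=64,
)
torch.manual_seed(0)
backbone = LlamaModel(config).eval()

input_ids = torch.randint(0, config.vocab_size, (2, 8))

with torch.no_grad():
    # Old call: return every layer's hidden states.
    full = backbone(input_ids=input_ids, output_hidden_states=True, use_cache=False)
    # New call: return only the final hidden state.
    lean = backbone(input_ids=input_ids, output_hidden_states=False, use_cache=False)

num_stored = len(full.hidden_states)  # tensors kept alive by the old call
same = torch.allclose(full.hidden_states[-1], lean.last_hidden_state)

print("tensors returned with output_hidden_states=True:", num_stored)
print("hidden_states[-1] matches last_hidden_state:", same)
print("hidden_states when disabled:", lean.hidden_states)
```

Using torch.allclose rather than eyeballing printed tensors makes the check robust to how the values are truncated for display.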

perf: avoid output_hidden_states when only last_hidden_state is used

ecebafd

ciaoyizhen force-pushed the perf/use-last-hidden-state branch from f4367b7 to ecebafd · December 28, 2025 15:52
