See how important each token of the context was for the LLM response #8753
LiquidGunay started this conversation in Ideas

I think the ability to get something like an average attention score for each token of the context would be really useful for seeing which parts of the context the LLM "focused" on more. This would be fairly useful for RAG and QA applications.
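To make the idea concrete, here is a rough sketch of what a per-token "focus" score could look like, using the Hugging Face transformers API purely as an illustration (it can return attention weights directly). The model name, context, and question are placeholders, and averaging raw causal attention is only a heuristic for token importance, not a faithful attribution method.

```python
# Rough sketch: average attention over layers, heads, and query positions to
# get one "focus" score per context token. Model, context, and question are
# placeholders; this is an illustration, not an existing feature.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder: any causal LM that can return attentions
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

context = "The Eiffel Tower is in Paris. The Colosseum is in Rome."
question = " Where is the Eiffel Tower?"
inputs = tokenizer(context + question, return_tensors="pt")
n_context = len(tokenizer(context)["input_ids"])  # tokens belonging to the context

with torch.no_grad():
    out = model(**inputs, output_attentions=True)

# out.attentions is a tuple with one tensor per layer,
# each shaped (batch, heads, query_len, key_len).
att = torch.stack(out.attentions)   # (layers, batch, heads, q, k)
att = att.mean(dim=(0, 2))[0]       # average over layers and heads -> (q, k)

# Average how much the question tokens attend to each context token.
per_token = att[n_context:, :n_context].mean(dim=0)

for tok_id, score in zip(inputs["input_ids"][0][:n_context], per_token):
    print(f"{tokenizer.decode([int(tok_id)])!r:>12}  {score.item():.4f}")
```

Restricting the query positions to the question (or generated response) tokens is a deliberate choice here; averaging over every query position would also mix in how context tokens attend to each other.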
Replies: 1 comment
Great point: measuring token-level attention focus sounds simple, but it quickly gets murky once you factor in entropy collapse and multi-head overlap, and I've run into a number of deeper issues when tracing this kind of behavior.

I've been experimenting with ways to modulate attention per-head (e.g., diversity injection, illegal path suppression) to recover clearer patterns, sort of like giving each head a semantic identity instead of letting them collapse into noise. If you're diving into attention introspection, I'd love to trade notes.
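As a side note on the entropy-collapse point, below is a rough, self-contained sketch (my own illustration, not the commenter's method) of per-head attention entropy, which is one way to see which heads have collapsed onto a few tokens. The synthetic tensor at the end stands in for the stacked `out.attentions` from the snippet above.

```python
# Rough sketch: per-head attention entropy as one way to quantify "entropy
# collapse". A head that puts nearly all of its mass on a single token scores
# ~0; a head that attends uniformly scores ~log(key_len).
import torch

def head_entropy(att: torch.Tensor, eps: float = 1e-9) -> torch.Tensor:
    """Mean attention entropy per (layer, head).

    `att` has shape (layers, batch, heads, query_len, key_len) with each
    query row summing to 1; the result has shape (layers, heads).
    """
    ent = -(att * att.clamp_min(eps).log()).sum(dim=-1)  # (layers, batch, heads, q)
    return ent.mean(dim=(1, 3))                          # (layers, heads)

# Demo on synthetic data: one collapsed head vs. one diffuse head.
q, k = 8, 8
sharp = torch.zeros(q, k)
sharp[:, 0] = 1.0                       # all attention mass on the first token
diffuse = torch.full((q, k), 1.0 / k)   # uniform attention over all tokens
att = torch.stack([sharp, diffuse])[None, None]  # (layers=1, batch=1, heads=2, q, k)
print(head_entropy(att))  # ~[[0.00, 2.08]]: head 0 collapsed, head 1 ~ ln(8)
```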