Layer-wise KV Cache Allocation for Models with Alternating Attention Patterns #6300
