
Mitigating Posterior Salience Attenuation in Long-Context LLMs with Positional Contrastive Decoding

About

Although Large Language Models (LLMs) support long contexts, their performance degrades even within the context window. Existing remedies incur prohibitive training costs, leaving the statistical behavior of this degradation and cost-effective mitigations underexplored. From the decoding perspective, we identify the Posterior Salience Attenuation (PSA) phenomenon, in which the salience ratio correlates with long-text performance degradation. Notably, despite the attenuation, gold tokens still occupy high-ranking positions in the decoding space. Motivated by this observation, we propose training-free Positional Contrastive Decoding (PCD), which contrasts the logits derived from long-aware attention with those from a designed local-aware attention, enabling the model to focus on the gains introduced by large-scale short-to-long training. Through an analysis of long-term decay simulation, we demonstrate that PCD effectively alleviates attention score degradation. Experimental results show that PCD achieves state-of-the-art performance on long-context benchmarks.

Zikai Xiao, Ziyang Wang, Wen Ma, Yan Zhang, Wei Shen, Yan Wang, Luqi Gong, Zuozhu Liu • 2025
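
The abstract frames PCD as a training-free contrast between two sets of logits: those produced under the model's full, long-aware attention and those produced under a designed local-aware attention. The paper's exact formulation is not given on this page, so the following is a minimal sketch of a generic contrastive decoding step consistent with that description; `alpha`, `plaus_ratio`, and the plausibility mask are standard contrastive-decoding choices assumed for illustration, not confirmed details of PCD.

```python
# Minimal sketch of a contrastive decoding step in the spirit of PCD, assuming
# two forward passes of the same model on the same prefix: one with full
# (long-aware) attention and one with attention restricted to a local window.
# The combination rule, alpha, and the plausibility mask are generic
# contrastive-decoding assumptions, not the paper's confirmed formulation.
import math

import torch
import torch.nn.functional as F

def contrastive_next_token(logits_long: torch.Tensor,
                           logits_local: torch.Tensor,
                           alpha: float = 1.0,
                           plaus_ratio: float = 0.1) -> int:
    """Pick the next token by amplifying what long-range context adds.

    logits_long  -- vocab-sized logits under full, long-aware attention
    logits_local -- vocab-sized logits under a local-window attention mask
    alpha        -- contrast strength (assumed hyperparameter)
    plaus_ratio  -- tokens whose long-attention probability falls below
                    plaus_ratio * max probability are excluded (assumption)
    """
    log_p_long = F.log_softmax(logits_long, dim=-1)
    log_p_local = F.log_softmax(logits_local, dim=-1)

    # Boost tokens whose salience comes from long-range context rather than
    # from the local window alone.
    scores = log_p_long + alpha * (log_p_long - log_p_local)

    # Plausibility constraint: the abstract notes gold tokens remain
    # high-ranked despite attenuation, so restricting the contrast to the
    # long model's plausible set avoids promoting low-ranked noise.
    plausible = log_p_long >= log_p_long.max() + math.log(plaus_ratio)
    scores = scores.masked_fill(~plausible, float("-inf"))
    return int(scores.argmax().item())
```

In practice, the two logit vectors would come from running the same model twice on an identical prefix, once with its normal attention and once with a sliding-window attention mask; only the final-position logits are needed at each decoding step.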

Related benchmarks

Task                             Dataset                    Metric        Score   Rank
Long-context Question Answering  LongBench HotpotQA (test)  —             15.29   59
Key-Value Retrieval              InfiniteBench 4k           Accuracy (%)  100     12
Key-Value Retrieval              InfiniteBench 8k           Accuracy (%)  96      12
Variable Tracking                RULER 4k                   F1 Score      81.8    12
Variable Tracking                RULER 8k                   F1 Score      77.92   12
Key-Value Retrieval              InfiniteBench 16k          Accuracy (%)  87      10
Variable Tracking                RULER 16k                  F1 Score      69.11   10
