AlphaToken: Decoupling Adaptation and Stability for Path-Aware Response Token Valuation in LLM Post-Training

About

Token selection is pivotal for effective LLM post-training. However, existing methods mostly rely on local heuristics and rarely formulate token selection as a principled valuation of individual response tokens. We introduce $\textbf{AlphaToken}$, a response token valuation framework that decouples valuation into $\textbf{adaptation}$ (promoting target-task learning) and $\textbf{stability}$ (preserving pre-trained capabilities), and makes each objective $\textbf{path-aware}$ by combining the direct-path signal from local token gradients with the downstream causal-path signal in autoregressive generation. Since retention data are typically unavailable, AlphaToken approximates stability via a $\textbf{Fisher-drift proxy}$ anchored at the pre-trained reference model. For efficient computation, we extend Ghost Dot-Product to token-level valuation. AlphaToken masks low-value response tokens during fine-tuning and preference optimization, concentrating training signals on more valuable positions. Experiments show that AlphaToken improves post-training performance and mitigates catastrophic forgetting.
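The masking step described above can be illustrated with a small sketch. The abstract does not give the valuation formula, so everything here is an assumption made for illustration: the per-token adaptation and stability scores are taken as given inputs, the combination rule `adaptation - lam * stability_cost`, the `keep_ratio` threshold, and the function names `select_token_mask` and `masked_mean_loss` are hypothetical and are not the authors' implementation.

```python
from typing import List


def select_token_mask(adaptation: List[float],
                      stability_cost: List[float],
                      lam: float = 1.0,
                      keep_ratio: float = 0.5) -> List[int]:
    """Return a 0/1 mask that keeps the highest-valued response tokens.

    Assumption: a token's value is its adaptation score minus a weighted
    stability cost. The paper's actual valuation may differ.
    """
    if len(adaptation) != len(stability_cost):
        raise ValueError("score lists must have the same length")
    n = len(adaptation)
    if n == 0:
        return []
    # Hypothetical combination rule for the two decoupled objectives.
    values = [a - lam * s for a, s in zip(adaptation, stability_cost)]
    # Keep at least one token so the loss is never empty.
    k = max(1, int(keep_ratio * n))
    # Rank positions by value (descending); ties go to the earlier token.
    ranked = sorted(range(n), key=lambda i: (-values[i], i))
    keep = set(ranked[:k])
    return [1 if i in keep else 0 for i in range(n)]


def masked_mean_loss(token_losses: List[float], mask: List[int]) -> float:
    """Average the per-token losses over the kept (unmasked) positions only."""
    kept = sum(mask)
    if kept == 0:
        return 0.0
    return sum(l * m for l, m in zip(token_losses, mask)) / kept


if __name__ == "__main__":
    adaptation = [0.9, 0.1, 0.5, 0.7]       # made-up scores for four tokens
    stability_cost = [0.2, 0.0, 0.6, 0.1]   # made-up drift costs
    mask = select_token_mask(adaptation, stability_cost)
    print(mask)
    print(masked_mean_loss([2.0, 1.0, 3.0, 4.0], mask))
```

In this toy input the third token has a moderate adaptation score but a high stability cost, so it is masked out even though the second token scores lower on adaptation alone. That is the kind of trade-off a decoupled valuation is meant to expose.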

Liu Qing, Ou Wu, Yi Du • 2026

Related benchmarks

Task | Dataset | Metric | Result | Rank
Instruction Following | AlpacaEval 2.0 | Win Rate | 38.76 | 722
Commonsense Reasoning | HellaSwag | HellaSwag Accuracy | 56.21 | 711
Multitask Language Understanding | MMLU | Accuracy | 67.05 | 520
Instruction Following | Arena Hard | Win Rate | 34.6 | 263
Code Generation | HumanEval | HumanEval Score | 78.88 | 128
General Capability Evaluation | General Capability Suite (MMLU, GSM8K, HumanEval, IFEval) | Common Average Score | 72.59 | 39
General Capability Evaluation | General Capability Suite (ARC-C, HellaSwag, MMLU, GSM8K) | ARC-C Accuracy | 53.13 | 27
Science Question Answering | ARC-C | Accuracy (ARC-C) | 50.74 | 25
Overall Performance Evaluation | Consolidated Evaluation Benchmark | Overall Average Score | 49.49 | 18
Preference Aggregation | Preference Evaluation Suite Aggregate | Average Preference Win Rate | 36.68 | 18
Showing 10 of 14 rows
