Min-$k$ Sampling: Decoupling Truncation from Temperature Scaling via Relative Logit Dynamics
About
The quality of text generated by large language models depends critically on the decoding strategy. Mainstream methods such as Top-$k$, Top-$p$, and Min-$p$ balance diversity and accuracy by truncating in probability space, but they share an inherent limitation: extreme sensitivity to the temperature parameter. Recent logit-space approaches such as Top-$n\sigma$ achieve temperature invariance but rely on global statistics that are susceptible to long-tail noise, and so fail to capture fine-grained confidence structure among the top candidates. We propose Min-$k$ Sampling, a novel dynamic truncation strategy that analyzes the local shape of the sorted logit distribution to identify "semantic cliffs": sharp transitions from high-confidence core tokens to uncertain long-tail tokens. By computing a position-weighted relative decay rate, Min-$k$ dynamically determines the truncation boundary at each generation step. We formally prove that Min-$k$ achieves strict temperature invariance and empirically demonstrate its low sensitivity to hyperparameter choices. Experiments on multiple reasoning benchmarks, creative writing tasks, and human evaluation show that Min-$k$ consistently improves text quality and maintains robust performance even under extreme temperature settings where probability-based methods collapse. We make our code, models, and analysis tools publicly available.
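The abstract does not give the exact decay formula, so the following is only a minimal sketch of the idea under stated assumptions, not the authors' implementation. It assumes the "relative decay" at rank $i$ is the gap between adjacent sorted logits divided by the logit spread inside a candidate window, and that the position weight is $1/i^{\alpha}$. The names `min_k_cutoff`, `min_k_probs`, `window`, and `alpha` are hypothetical. Because the score is a ratio of logit differences, dividing all logits by a temperature leaves it unchanged, which is the kind of invariance the abstract claims.

```python
import math


def min_k_cutoff(logits, window=20, alpha=1.0):
    """Return how many top tokens to keep (hypothetical rule, not the paper's)."""
    ranked = sorted(logits, reverse=True)[:window]
    if len(ranked) < 2:
        return len(ranked)
    spread = ranked[0] - ranked[-1]
    if spread <= 0:
        return len(ranked)  # flat logits: no cliff, keep the whole window
    best_k, best_score = 1, -math.inf
    for i in range(len(ranked) - 1):
        gap = ranked[i] - ranked[i + 1]
        # Assumed score: scale-free relative gap times a 1/rank^alpha weight.
        score = (gap / spread) / (i + 1) ** alpha
        if score > best_score:
            best_k, best_score = i + 1, score
    return best_k


def min_k_probs(logits, temperature=1.0, window=20, alpha=1.0):
    """Softmax over the kept tokens only; all other tokens get probability 0."""
    scaled = [x / temperature for x in logits]
    k = min_k_cutoff(scaled, window, alpha)
    order = sorted(range(len(scaled)), key=lambda j: scaled[j], reverse=True)
    keep = order[:k]
    top = scaled[keep[0]]  # subtract the max for numerical stability
    weights = {j: math.exp(scaled[j] - top) for j in keep}
    total = sum(weights.values())
    return [weights.get(j, 0.0) / total for j in range(len(scaled))]


if __name__ == "__main__":
    example = [10.0, 9.8, 9.5, 4.0, 3.5, 3.0, 1.0]
    for t in (0.5, 1.0, 5.0):
        probs = min_k_probs(example, temperature=t)
        kept = sum(p > 0 for p in probs)
        print(f"T={t}: kept {kept} tokens, top prob {max(probs):.3f}")
```

In this toy example the largest weighted gap sits between the third and fourth logits, so the same three tokens survive at every temperature. Temperature then only reshapes the probabilities within that fixed set, which is the decoupling the title refers to.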
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Scientific Reasoning | GPQA Main | Accuracy | 30.8 | 67 |
| Mathematical Reasoning | AQUA | AQuA Exact Match | 79.92 | 60 |
| Mathematical Reasoning | GSM8K | Exact Match Accuracy (GSM8K) | 93.63 | 60 |
| Science Question Answering | GPQA main (test) | Exact Match Accuracy | 40.85 | 60 |
| Mathematics Problem Solving | MATH500 (test) | Exact Match Accuracy | 61.2 | 60 |
| Mathematical Reasoning | MATH 500 | Exact Match | 59.8 | 60 |
| Mathematical Reasoning | GSM8K (test) | Exact Match Accuracy (GSM8K Test) | 95.6 | 60 |
| Creative Writing | Creative Writing | Win Rate | 58.6 | 36 |
| Creative Writing | Creative Writing Human Evaluation | Human Preference Count | 75 | 9 |