ALIEN: Aligned Entropy Head for Improving Uncertainty Estimation of LLMs

About

Uncertainty estimation remains a key challenge when adapting pre-trained language models to downstream classification tasks, with overconfidence often observed for difficult inputs. While predictive entropy provides a strong baseline for uncertainty estimation, it mainly captures aleatoric uncertainty and has limited capacity to capture effects such as class overlap or ambiguous linguistic cues. We introduce Aligned Entropy (ALIEN), a lightweight method that refines entropy-based uncertainty by aligning it with prediction reliability. ALIEN trains a small uncertainty head that is initialized to reproduce the model's original entropy and is then fine-tuned with two regularization mechanisms. Experiments across seven classification datasets and two NER benchmarks, evaluated on five language models (RoBERTa, ELECTRA, LLaMA-2, Qwen2.5, and Qwen3), show that ALIEN consistently outperforms strong baselines across all considered scenarios in detecting incorrect predictions, while achieving the lowest calibration error. The proposed method introduces only a small inference overhead (on the order of milliseconds per batch on CPU) and increases the model's parameter count by just 0.002% for decoder models and 0.5% for encoder models, without requiring storage of intermediate states. It improves uncertainty estimation while preserving the original model architecture, making the approach practical for large-scale deployment with modern language models. Our results demonstrate that entropy can be effectively refined through lightweight supervised alignment, producing more reliable uncertainty estimates without modifying the backbone model. The code is available at 4.
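For readers unfamiliar with the baseline that the abstract refers to, the following is a minimal sketch of predictive entropy computed from classifier logits. It illustrates only the standard entropy score, not the ALIEN head itself; the function names and the toy logits are assumptions made for illustration and do not come from the paper's code.

```python
import numpy as np

def softmax(logits):
    # Subtract the row-wise maximum for numerical stability.
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def predictive_entropy(logits):
    """Shannon entropy (in nats) of the softmax distribution for each input."""
    p = softmax(np.asarray(logits, dtype=float))
    # Clip probabilities so that log(0) never occurs.
    return -(p * np.log(np.clip(p, 1e-12, 1.0))).sum(axis=-1)

# Hypothetical logits for two inputs over three classes:
# the first is confidently predicted, the second is maximally ambiguous.
logits = np.array([[4.0, 0.0, 0.0],
                   [0.0, 0.0, 0.0]])
entropy = predictive_entropy(logits)
```

Higher entropy is read as higher uncertainty: the second, uniform row attains the maximum value log(3), while the peaked first row scores much lower.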

Artem Zabolotnyi, Roman Makarov, Mile Mitrovic, Polina Proskura, Oleg Travkin, Roman Alferov, Alexey Zaytsev • 2025

Related benchmarks

Task	Dataset	Result
Misclassification Detection	COLA	ROC-AUC 81.6	31
Uncertainty Estimation	Aggregate (Cola, GEmot, IMDB, News, SST5, Toxigen, YELP)	ECE 8.4	13
Misclassification Detection	GEmot	ROC-AUC 68.2	11
Misclassification Detection	IMDB	ROC-AUC 89.1	10
Misclassification Detection	News	ROC-AUC 88.8	10
Misclassification Detection	SST5	ROC-AUC 63.9	10
Misclassification Detection	Tox	ROC-AUC 81.6	10
Misclassification Detection	Yelp	ROC-AUC 88.5	10
Misclassification Detection	CoNLL 2003 (test)	ROC-AUC 89.6	7
Misclassification Detection	WNUT 2017 (test)	ROC-AUC 0.911	7
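
The two metrics in the table can be sketched as follows. This is an illustrative implementation under stated assumptions, not the evaluation code behind the reported numbers: ROC-AUC is computed as the probability that an incorrect prediction receives a higher uncertainty score than a correct one, and ECE uses equal-width confidence bins (the bin count of 10 is an assumption). Both functions return fractions in [0, 1]; most table entries appear to be reported on a 0-100 scale.

```python
import numpy as np

def misclassification_roc_auc(uncertainty, is_error):
    """Probability that an error gets a higher uncertainty than a correct
    prediction; ties count as one half (Mann-Whitney formulation)."""
    u = np.asarray(uncertainty, dtype=float)
    e = np.asarray(is_error, dtype=bool)
    pos, neg = u[e], u[~e]
    if len(pos) == 0 or len(neg) == 0:
        raise ValueError("need at least one error and one correct prediction")
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (greater + 0.5 * ties) / (len(pos) * len(neg))

def expected_calibration_error(confidence, is_correct, n_bins=10):
    """Weighted mean gap between accuracy and confidence over equal-width bins."""
    c = np.asarray(confidence, dtype=float)
    y = np.asarray(is_correct, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    # Interior edges only, so bin indices fall in 0 .. n_bins - 1.
    bin_ids = np.digitize(c, edges[1:-1], right=True)
    ece = 0.0
    for b in range(n_bins):
        mask = bin_ids == b
        if mask.any():
            ece += mask.mean() * abs(y[mask].mean() - c[mask].mean())
    return ece

# Toy data (hypothetical): errors carry the two highest uncertainty scores.
auc = misclassification_roc_auc([0.1, 0.9, 0.2, 0.8], [False, True, False, True])
# Four predictions at 0.9 confidence, three of them correct.
ece = expected_calibration_error([0.9, 0.9, 0.9, 0.9], [1, 1, 1, 0])
```

In the toy data every error is ranked above every correct prediction, so the detection score is perfect, while the calibration gap is the difference between the stated confidence of 0.9 and the observed accuracy of 0.75.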
