ALIEN: Aligned Entropy Head for Improving Uncertainty Estimation of LLMs

About

Uncertainty estimation remains a key challenge when adapting pre-trained language models to downstream classification tasks, with overconfidence often observed for difficult inputs. While predictive entropy provides a strong baseline for uncertainty estimation, it mainly captures aleatoric uncertainty and has limited capacity to capture effects such as class overlap or ambiguous linguistic cues. We introduce Aligned Entropy (ALIEN), a lightweight method that refines entropy-based uncertainty by aligning it with prediction reliability. ALIEN trains a small uncertainty head that is initialized to reproduce the model's original entropy and is then fine-tuned with two regularization mechanisms. Experiments across seven classification datasets and two NER benchmarks, evaluated on five language models (RoBERTa, ELECTRA, LLaMA-2, Qwen2.5, and Qwen3), show that ALIEN consistently outperforms strong baselines across all considered scenarios in detecting incorrect predictions, while achieving the lowest calibration error. The proposed method introduces only a small inference overhead (on the order of milliseconds per batch on CPU) and increases the model's parameter count by just 0.002% for decoder models and 0.5% for encoder models, without requiring storage of intermediate states. It improves uncertainty estimation while preserving the original model architecture, making the approach practical for large-scale deployment with modern language models. Our results demonstrate that entropy can be effectively refined through lightweight supervised alignment, producing more reliable uncertainty estimates without modifying the backbone model. The code is available at 4.
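To make the baseline concrete, here is a minimal sketch of predictive entropy used as an uncertainty score, together with ROC-AUC for misclassification detection. It illustrates the plain entropy baseline only, not the ALIEN head, whose architecture and regularizers are not described on this page. The function names and the toy probabilities are assumptions made for illustration.

```python
import numpy as np

def predictive_entropy(probs):
    """Shannon entropy of each row of class probabilities (natural log)."""
    probs = np.clip(probs, 1e-12, 1.0)  # avoid log(0)
    return -np.sum(probs * np.log(probs), axis=-1)

def roc_auc(scores, labels):
    """ROC-AUC via pairwise comparison; label 1 marks an incorrect prediction.

    Equals the probability that a random error receives a higher score
    than a random correct prediction, with ties counted as one half.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    pos = scores[labels == 1]
    neg = scores[labels == 0]
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (greater + 0.5 * ties) / (len(pos) * len(neg))

# Toy softmax outputs for four inputs over three classes (illustrative data).
probs = np.array([
    [0.98, 0.01, 0.01],
    [0.40, 0.35, 0.25],
    [0.90, 0.05, 0.05],
    [0.34, 0.33, 0.33],
])
gold = np.array([0, 1, 0, 2])

pred = probs.argmax(axis=-1)
errors = (pred != gold).astype(int)   # 1 where the model is wrong
entropy = predictive_entropy(probs)   # higher value = less certain
auc = roc_auc(entropy, errors)
print(f"entropy={np.round(entropy, 3)}, errors={errors}, ROC-AUC={auc:.3f}")
```

In this toy case the two wrong predictions also have the two highest entropies, so the score separates errors perfectly. On real data the separation is imperfect, and that gap is what a refined uncertainty score aims to narrow.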

Artem Zabolotnyi, Roman Makarov, Mile Mitrovic, Polina Proskura, Oleg Travkin, Roman Alferov, Alexey Zaytsev • 2025

Related benchmarks

Task                         Dataset                                                    Result          Rank
Misclassification Detection  COLA                                                       ROC-AUC 81.6    31
Uncertainty Estimation       Aggregate (Cola, GEmot, IMDB, News, SST5, Toxigen, YELP)   ECE 8.4         13
Misclassification Detection  GEmot                                                      ROC-AUC 68.2    11
Misclassification Detection  IMDB                                                       ROC-AUC 89.1    10
Misclassification Detection  News                                                       ROC-AUC 88.8    10
Misclassification Detection  SST5                                                       ROC-AUC 63.9    10
Misclassification Detection  Tox                                                        ROC-AUC 81.6    10
Misclassification Detection  Yelp                                                       ROC-AUC 88.5    10
Misclassification Detection  CoNLL 2003 (test)                                          ROC-AUC 89.6    7
Misclassification Detection  WNUT 2017 (test)                                           ROC-AUC 0.911   7
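For readers unfamiliar with the ECE metric listed above, the following is a minimal sketch of expected calibration error with equal-width confidence bins. The bin count, the function name, and the toy data are assumptions; the page does not state how the reported value was binned or scaled. The sketch returns a fraction in [0, 1].

```python
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=10):
    """Weighted average of |accuracy - mean confidence| over confidence bins."""
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        # Half-open bins (lo, hi]; a confidence of exactly 0 falls in no bin.
        in_bin = (confidences > lo) & (confidences <= hi)
        if not in_bin.any():
            continue
        gap = abs(correct[in_bin].mean() - confidences[in_bin].mean())
        ece += in_bin.mean() * gap  # in_bin.mean() is the bin's share of samples
    return ece

# Toy data: max softmax probability and whether the prediction was right.
conf = np.array([0.95, 0.95, 0.95, 0.95, 0.55, 0.55])
hit = np.array([1, 1, 1, 0, 1, 0])
ece = expected_calibration_error(conf, hit)
print(f"ECE = {ece:.3f}")
```

Here the high-confidence bin is 75% accurate at 95% confidence and the other bin is 50% accurate at 55% confidence, so the weighted gap is (4/6)(0.20) + (2/6)(0.05) = 0.15.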