
Are Uncertainty Quantification Capabilities of Evidential Deep Learning a Mirage?

About

This paper questions the effectiveness of a modern predictive uncertainty quantification approach called evidential deep learning (EDL), in which a single neural network is trained to learn a meta distribution over the predictive distribution by minimizing a specific objective function. Despite their perceived strong empirical performance on downstream tasks, a line of recent studies by Bengs et al. identifies limitations of the existing methods and concludes that their learned epistemic uncertainties are unreliable, e.g., non-vanishing even with infinite data. Building on and sharpening this analysis, we 1) provide a sharper understanding of the asymptotic behavior of a wide class of EDL methods by unifying their various objective functions; 2) reveal that EDL methods are better interpreted as out-of-distribution (OOD) detection algorithms based on energy-based models; and 3) conduct extensive ablation studies to better assess their empirical effectiveness on real-world datasets. Through these analyses, we conclude that even when EDL methods are empirically effective on downstream tasks, this occurs despite their poor uncertainty quantification capabilities. Our investigation suggests that incorporating model uncertainty can help EDL methods quantify uncertainty faithfully and further improve performance on representative downstream tasks, albeit at the cost of additional computational complexity.
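To make the "meta distribution over the predictive distribution" concrete: EDL classifiers typically parameterize a Dirichlet over class probabilities. The sketch below is a minimal illustration (not the paper's code); the softplus evidence mapping and the uncertainty score u = K / S are common choices in the EDL literature, assumed here for illustration.

```python
import numpy as np

def edl_uncertainty(logits):
    """Illustrative Dirichlet-based EDL uncertainty (not the paper's code).

    The network's logits are mapped to non-negative per-class 'evidence';
    the Dirichlet concentration is alpha = evidence + 1. The total
    concentration S drives the epistemic uncertainty score u = K / S,
    which shrinks only as the accumulated evidence grows.
    """
    evidence = np.log1p(np.exp(logits))        # softplus keeps evidence >= 0
    alpha = evidence + 1.0                     # Dirichlet concentration parameters
    S = alpha.sum(axis=-1, keepdims=True)      # total evidence per example
    probs = alpha / S                          # mean of the Dirichlet = predictive dist.
    u = logits.shape[-1] / S.squeeze(-1)       # epistemic uncertainty, in (0, K]
    return probs, u
```

A key point of the paper's critique is that such a score need not vanish as data grows, i.e., it does not behave like a faithful epistemic uncertainty.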

Maohao Shen, J. Jon Ryu, Soumya Ghosh, Yuheng Bu, Prasanna Sattigeri, Subhro Das, Gregory W. Wornell • 2024
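Point 2) of the abstract reinterprets EDL as energy-based OOD detection. A minimal sketch of the standard energy score from the energy-OOD literature (its form here is an assumption for illustration, not the paper's derivation):

```python
import numpy as np

def energy_score(logits, T=1.0):
    """Energy E(x) = -T * logsumexp(logits / T), computed stably.

    Lower (more negative) energy indicates in-distribution, so -E is
    commonly used as the OOD detection score.
    """
    z = np.asarray(logits, dtype=float) / T
    m = z.max(axis=-1, keepdims=True)                     # stabilize the log-sum-exp
    return -T * (m.squeeze(-1) + np.log(np.exp(z - m).sum(axis=-1)))
```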

Related benchmarks

Task | Dataset | Metric | Result | Rank
OOD Detection | CIFAR-10 (in-dist.) / SVHN (OOD) | AUROC | 0.978 | 91
OOD Detection | CIFAR-100 (in-dist.) / SVHN (OOD) | AUROC (%) | 85.6 | 74
OOD Detection | CIFAR-10 (in-dist.) / FMNIST (OOD) | AUROC | 0.944 | 54
OOD Detection | CIFAR-10 (test, OOD) | AUROC (%) | 98.4 | 36
Selective Classification | CIFAR-100 (test) | AUC | 0.857 | 32
OOD Detection | CIFAR-100 (in-dist.) / TinyImageNet (OOD) | AUROC | 0.841 | 31
OOD Detection | TinyImageNet (in-dist.) / CIFAR-10 (OOD) | AUPR (%) | 85.6 | 24
Selective Classification | CIFAR-10 (test) | AUC | 0.903 | 21
OOD Detection | CIFAR-10 (in-dist.) / ImageNet-R (OOD) | AUROC (%) | 89.6 | 20
OOD Detection | CIFAR-10 vs SVHN (test) | -- | -- | 19

Showing 10 of 18 rows
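The OOD-detection rows above report AUROC over an uncertainty score. AUROC has a simple rank interpretation, the probability that a random OOD example scores higher than a random in-distribution example, which the following self-contained sketch computes directly:

```python
import numpy as np

def auroc(ind_scores, ood_scores):
    """Rank-based AUROC: P(score_ood > score_ind), with ties counted as 0.5.

    Equivalent to the area under the ROC curve when higher scores are
    treated as 'more OOD'. O(n*m) pairwise version, fine for small arrays.
    """
    ind = np.asarray(ind_scores, dtype=float)
    ood = np.asarray(ood_scores, dtype=float)
    greater = (ood[:, None] > ind[None, :]).sum()
    ties = (ood[:, None] == ind[None, :]).sum()
    return (greater + 0.5 * ties) / (ind.size * ood.size)
```

An AUROC of 0.978 (first row) thus means a random SVHN example outscores a random CIFAR-10 example 97.8% of the time.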
