Robustness Quantification and Uncertainty Quantification: Comparing Two Methods for Assessing the Reliability of Classifier Predictions
About
We consider two approaches for assessing the reliability of the individual predictions of a classifier: Robustness Quantification (RQ) and Uncertainty Quantification (UQ). We explain the conceptual differences between the two approaches, compare them on a number of benchmark datasets, and show that RQ is capable of outperforming UQ, both in a standard setting and in the presence of distribution shift. Besides showing that RQ can be competitive with UQ, we also demonstrate the complementarity of the two approaches: combining them can lead to even better reliability assessments.
Adrián Detavernier, Jasper De Bock • 2026
Related benchmarks
| Task | Dataset | AU-ARC | Rank |
|---|---|---|---|
| Reliability Assessment | D7 (test) | 95.11 | 5 |
| Reliability Assessment | D8 (test) | 50.3 | 5 |
| Reliability Assessment | D10 (test) | 90.29 | 5 |
| Reliability Assessment | D11 (test) | 87.36 | 5 |
| Reliability Assessment | D13 (test) | 95.82 | 5 |
| Reliability Assessment | D1 (test) | 91.96 | 5 |
| Reliability Assessment | D2 (test) | 92.1 | 5 |
| Reliability Assessment | D3 (test) | 94.56 | 5 |
| Reliability Assessment | D4 (test) | 99.68 | 5 |
| Reliability Assessment | D5 (test) | 87.46 | 5 |
(Showing 10 of 28 rows.)
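The AU-ARC metric reported above commonly denotes the area under the accuracy-rejection curve: at each rejection rate, the least-reliable fraction of predictions is rejected and accuracy is measured on the remainder, and the resulting accuracies are averaged. The paper does not spell out its exact computation here, so the sketch below is a generic, assumed implementation; the function name `au_arc` and its interface are illustrative only.

```python
import numpy as np

def au_arc(correct, reliability):
    """Area under the accuracy-rejection curve (assumed generic form).

    correct:     per-prediction correctness (booleans or 0/1).
    reliability: per-prediction reliability score (higher = keep longer).

    Predictions are rejected from least to most reliable; accuracy is
    computed on each retained prefix and averaged over rejection rates.
    """
    order = np.argsort(-np.asarray(reliability))   # most reliable first
    correct = np.asarray(correct, dtype=float)[order]
    n = len(correct)
    # accuracy on the k most reliable predictions, for k = 1..n
    acc_at_k = np.cumsum(correct) / np.arange(1, n + 1)
    # average accuracy over the discrete grid of rejection rates
    return acc_at_k.mean()
```

A reliability measure that ranks all incorrect predictions below all correct ones yields the highest possible AU-ARC for a given accuracy, which is why the metric rewards well-calibrated reliability assessments.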