Robustness Quantification and Uncertainty Quantification: Comparing Two Methods for Assessing the Reliability of Classifier Predictions
About
We consider two approaches for assessing the reliability of the individual predictions of a classifier: Robustness Quantification (RQ) and Uncertainty Quantification (UQ). We explain the conceptual differences between the two approaches, compare them on a number of benchmark datasets, and show that RQ is capable of outperforming UQ, both in a standard setting and in the presence of distribution shift. Besides showing that RQ can be competitive with UQ, we also demonstrate the complementarity of the two approaches: combining them can lead to even better reliability assessments.
Adrián Detavernier, Jasper De Bock • 2026
Related benchmarks
| Task | Dataset | AU-ARC | Rank |
|---|---|---|---|
| Reliability Assessment | D7 (test) | 95.11 | 5 |
| Reliability Assessment | D8 (test) | 50.3 | 5 |
| Reliability Assessment | D10 (test) | 90.29 | 5 |
| Reliability Assessment | D11 (test) | 87.36 | 5 |
| Reliability Assessment | D13 (test) | 95.82 | 5 |
| Reliability Assessment | D1 (test) | 91.96 | 5 |
| Reliability Assessment | D2 (test) | 92.1 | 5 |
| Reliability Assessment | D3 (test) | 94.56 | 5 |
| Reliability Assessment | D4 (test) | 99.68 | 5 |
| Reliability Assessment | D5 (test) | 87.46 | 5 |
(Showing 10 of 28 rows.)
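The AU-ARC metric reported above commonly denotes the area under the accuracy-rejection curve: at each rejection rate, the least-reliable fraction of predictions is rejected and accuracy is measured on the remainder, and the resulting accuracies are averaged. The paper does not spell out its exact computation here, so the sketch below is a generic, assumed implementation; the function name `au_arc` and its interface are illustrative only.

```python
import numpy as np

def au_arc(correct, reliability):
    """Area under the accuracy-rejection curve (assumed generic form).

    correct:     per-prediction correctness (booleans or 0/1).
    reliability: per-prediction reliability score (higher = keep longer).

    Predictions are rejected from least to most reliable; accuracy is
    computed on each retained prefix and averaged over rejection rates.
    """
    order = np.argsort(-np.asarray(reliability))   # most reliable first
    correct = np.asarray(correct, dtype=float)[order]
    n = len(correct)
    # accuracy on the k most reliable predictions, for k = 1..n
    acc_at_k = np.cumsum(correct) / np.arange(1, n + 1)
    # average accuracy over the discrete grid of rejection rates
    return acc_at_k.mean()
```

A reliability measure that ranks all incorrect predictions below all correct ones yields the highest possible AU-ARC for a given accuracy, which is why the metric rewards well-calibrated reliability assessments.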