MANO: Exploiting Matrix Norm for Unsupervised Accuracy Estimation Under Distribution Shifts

About

Leveraging a model's outputs, specifically the logits, is a common approach to estimating the test accuracy of a pre-trained neural network on out-of-distribution (OOD) samples without requiring access to the corresponding ground-truth labels. Despite their ease of implementation and computational efficiency, current logit-based methods are vulnerable to overconfidence, which leads to prediction bias, especially under natural shifts. In this work, we first study the relationship between logits and generalization performance from the perspective of the low-density separation assumption. Our findings motivate our proposed method, MaNo, which (1) applies a data-dependent normalization to the logits to reduce prediction bias, and (2) takes the $L_p$ norm of the matrix of normalized logits as the estimation score. Our theoretical analysis highlights the connection between the provided score and the model's uncertainty. We conduct an extensive empirical study on common unsupervised accuracy estimation benchmarks and demonstrate that MaNo achieves state-of-the-art performance across various architectures in the presence of synthetic, natural, or subpopulation shifts. The code is available at https://github.com/Renchunzi-Xie/MaNo.

Renchunzi Xie, Ambroise Odonnat, Vasilii Feofanov, Weijian Deng, Jianfeng Zhang, Bo An • 2024
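
The two steps named in the abstract (normalize the logits, then take an $L_p$ norm of the resulting matrix) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not specify the data-dependent normalization, so a plain softmax is used here as a stand-in, and the function name `matrix_norm_score`, the default `p=4`, and the scaling by the number of matrix entries are all assumptions made for the example.

```python
import numpy as np


def softmax(logits):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)


def matrix_norm_score(logits, p=4):
    """Entrywise L_p norm of the normalized logit matrix.

    logits: array of shape (n_samples, n_classes).
    Assumptions (not taken from the abstract): softmax stands in for the
    paper's data-dependent normalization, and the sum is divided by the
    number of entries so scores are comparable across dataset sizes.
    """
    q = softmax(np.asarray(logits, dtype=float))
    n, k = q.shape
    return (np.sum(np.abs(q) ** p) / (n * k)) ** (1.0 / p)


# Confident predictions yield a larger score than uniform ones.
confident = np.eye(4) * 100.0   # each row strongly favors one class
uncertain = np.zeros((5, 4))    # every class equally likely
print(matrix_norm_score(confident), matrix_norm_score(uncertain))
```

Under these assumptions the score grows as the normalized rows concentrate on a single class, which matches the abstract's claim that the score is tied to the model's uncertainty.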

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
| Accuracy Estimation | PACS | $R^2$ | 0.924 | 50 |
| Accuracy Estimation | Entity-13 Subpopulation Shift | $R^2$ | 0.993 | 36 |
| Accuracy Estimation | Entity-30 Subpopulation Shift | $R^2$ | 0.991 | 36 |
| Accuracy Estimation | Living-17 Subpopulation Shift | $R^2$ | 0.98 | 36 |
| Unsupervised Accuracy Estimation | Office-Home | $R^2$ | 0.926 | 36 |
| Unsupervised Accuracy Estimation | DomainNet | $R^2$ | 0.91 | 36 |
| Accuracy Estimation | Nonliving-26 Subpopulation Shift | $R^2$ | 0.978 | 36 |
| Unsupervised Accuracy Estimation | RR1-WILDS | $R^2$ | 0.983 | 36 |
| Accuracy Estimation | TinyImageNet | MAE | 0.612 | 27 |
| Accuracy Estimation | ImageNet | MAE | 0.695 | 27 |
Showing 10 of 23 rows
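
For readers unfamiliar with the two metrics in the table, the sketch below shows one common way they are computed in accuracy-estimation work. This page does not state how the benchmark computes them, so the following is an assumption: $R^2$ is taken as the coefficient of determination of a linear fit from estimation scores to true accuracies across test sets, and MAE as the mean absolute difference between estimated and true accuracy. The function names are hypothetical.

```python
import numpy as np


def r_squared(scores, accuracies):
    """Coefficient of determination of a least-squares line: accuracy ~ score."""
    scores = np.asarray(scores, dtype=float)
    accuracies = np.asarray(accuracies, dtype=float)
    slope, intercept = np.polyfit(scores, accuracies, deg=1)
    predicted = slope * scores + intercept
    ss_res = np.sum((accuracies - predicted) ** 2)         # residual sum of squares
    ss_tot = np.sum((accuracies - accuracies.mean()) ** 2)  # total sum of squares
    return 1.0 - ss_res / ss_tot


def mean_absolute_error(estimated, true):
    """Mean absolute difference between estimated and true accuracies."""
    estimated = np.asarray(estimated, dtype=float)
    true = np.asarray(true, dtype=float)
    return np.mean(np.abs(estimated - true))
```

With these definitions, an $R^2$ close to 1 means the score is almost perfectly linearly predictive of accuracy across shifted test sets, and a lower MAE means the estimated accuracies are closer to the true ones.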
