On the Importance of Feature Separability in Predicting Out-Of-Distribution Error
About
Estimating generalization performance on out-of-distribution (OOD) data is practically challenging without ground-truth labels. While previous methods emphasize the connection between distribution difference and OOD accuracy, we show that a large domain gap does not necessarily lead to low test accuracy. In this paper, we investigate this problem empirically and theoretically from the perspective of feature separability. Specifically, we propose a dataset-level score based on feature dispersion to estimate test accuracy under distribution shift. Our method is inspired by desirable properties of features in representation learning: high inter-class dispersion and high intra-class compactness. Our analysis shows that inter-class dispersion is strongly correlated with model accuracy, whereas intra-class compactness does not reflect generalization performance on OOD data. Extensive experiments demonstrate the superiority of our method in both prediction performance and computational efficiency.
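As a rough illustration of the idea, inter-class dispersion can be measured as the average distance between class-mean feature vectors; since OOD labels are unavailable, classes would be assigned by the model's own predictions (pseudo-labels). This is a minimal sketch of that notion, not the paper's exact score — the function name and the choice of Euclidean distance are assumptions for illustration:

```python
import numpy as np

def inter_class_dispersion(features: np.ndarray, pseudo_labels: np.ndarray) -> float:
    """Average pairwise Euclidean distance between class-mean features.

    Hypothetical dispersion score: a higher value indicates class centroids
    that are more spread out in feature space. `pseudo_labels` would come
    from the model's predictions on the unlabeled OOD set.
    """
    classes = np.unique(pseudo_labels)
    # One centroid per predicted class.
    means = np.stack([features[pseudo_labels == c].mean(axis=0) for c in classes])
    # Pairwise distances between all centroids (k x k matrix, zero diagonal).
    diffs = means[:, None, :] - means[None, :, :]
    dists = np.linalg.norm(diffs, axis=-1)
    k = len(classes)
    # Mean over the k*(k-1) off-diagonal pairs.
    return float(dists.sum() / (k * (k - 1)))
```

Under the paper's finding, such a dispersion score computed on an OOD test set would correlate with the (unknown) test accuracy, whereas an analogous intra-class compactness score would not.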
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Accuracy Estimation | PACS | R² | 0.832 | 50 |
| Accuracy Estimation | Nonliving-26 Subpopulation Shift | R² | 0.958 | 36 |
| Accuracy Estimation | Entity-13 Subpopulation Shift | R² | 0.937 | 36 |
| Accuracy Estimation | Living-17 Subpopulation Shift | R² | 0.931 | 36 |
| Accuracy Estimation | Entity-30 Subpopulation Shift | R² | 0.929 | 36 |
| Unsupervised Accuracy Estimation | Office-Home | R² | 0.456 | 36 |
| Unsupervised Accuracy Estimation | RR1-WILDS | R² | 0.843 | 36 |
| Unsupervised Accuracy Estimation | DomainNet | R² | 0.202 | 36 |
| Accuracy Estimation | TinyImageNet | MAE | 1.054 | 27 |
| Accuracy Estimation | ImageNet | MAE | 2.602 | 27 |