Revisiting Classifier Two-Sample Tests

About

The goal of two-sample tests is to assess whether two samples, $S_P \sim P^n$ and $S_Q \sim Q^m$, are drawn from the same distribution. Perhaps intriguingly, one relatively unexplored method to build two-sample tests is the use of binary classifiers. In particular, construct a dataset by pairing the $n$ examples in $S_P$ with a positive label, and by pairing the $m$ examples in $S_Q$ with a negative label. If the null hypothesis "$P = Q$" is true, then the classification accuracy of a binary classifier on a held-out subset of this dataset should remain near chance-level. As we will show, such Classifier Two-Sample Tests (C2ST) learn a suitable representation of the data on the fly, return test statistics in interpretable units, have a simple null distribution, and their predictive uncertainty allows one to interpret where $P$ and $Q$ differ. The goal of this paper is to establish the properties, performance, and uses of C2ST. First, we analyze their main theoretical properties. Second, we compare their performance against a variety of state-of-the-art alternatives. Third, we propose their use to evaluate the sample quality of generative models with intractable likelihoods, such as Generative Adversarial Networks (GANs). Fourth, we showcase the novel application of GANs together with C2ST for causal discovery.
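The procedure described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function names `knn_predict` and `c2st`, the parameters `k`, `test_fraction`, and `seed`, and the choice of a k-nearest-neighbour classifier are all assumptions made for brevity, and any binary classifier could be substituted. The p-value relies on the normal approximation to the binomial: with balanced classes and $n_{te}$ held-out examples, the accuracy under "$P = Q$" is approximately $\mathcal{N}(1/2, 1/(4 n_{te}))$.

```python
import math
import random
from statistics import NormalDist


def knn_predict(train, point, k):
    """Majority vote among the k nearest training points (Euclidean distance)."""
    nearest = sorted(train, key=lambda xy: math.dist(xy[0], point))[:k]
    votes = sum(label for _, label in nearest)
    return 1 if 2 * votes > k else 0


def c2st(sample_p, sample_q, k=5, test_fraction=0.5, seed=0):
    """Return (held-out accuracy, one-sided p-value) for H0: P = Q.

    Assumes len(sample_p) == len(sample_q), so chance level is 1/2,
    and an odd k, so the majority vote has no ties.
    """
    rng = random.Random(seed)
    # Label examples from S_P as 1 and from S_Q as 0, then shuffle the pool.
    data = [(x, 1) for x in sample_p] + [(x, 0) for x in sample_q]
    rng.shuffle(data)
    n_test = int(len(data) * test_fraction)
    test, train = data[:n_test], data[n_test:]
    # Test statistic: classification accuracy on the held-out part.
    correct = sum(knn_predict(train, x, k) == y for x, y in test)
    accuracy = correct / n_test
    # Under H0 the accuracy is approximately N(1/2, 1/(4 * n_test)).
    null = NormalDist(mu=0.5, sigma=math.sqrt(0.25 / n_test))
    p_value = 1.0 - null.cdf(accuracy)
    return accuracy, p_value


# Hypothetical usage on synthetic 1-D data (points are 1-tuples).
rng = random.Random(1)
same_a = [(rng.gauss(0, 1),) for _ in range(200)]
same_b = [(rng.gauss(0, 1),) for _ in range(200)]
shifted = [(rng.gauss(3, 1),) for _ in range(200)]

acc_null, p_null = c2st(same_a, same_b)   # P = Q: accuracy should sit near 0.5
acc_alt, p_alt = c2st(same_a, shifted)    # P != Q: accuracy well above 0.5
print(f"same distribution:    accuracy={acc_null:.3f}, p={p_null:.3f}")
print(f"shifted distribution: accuracy={acc_alt:.3f}, p={p_alt:.3g}")
```

The returned accuracy is the test statistic in interpretable units (fraction of held-out examples classified correctly), and the null hypothesis is rejected when the p-value falls below the chosen significance level.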

David Lopez-Paz, Maxime Oquab • 2016

Related benchmarks

Task	Dataset	Result
Two-sample testing	CIFAR-10 vs CIFAR-10.1 (test)	Power 0.529	175
Two-sample testing	higgs	Test Power 100	159
Two-sample testing	CIFAR10-RES18 (test)	Test Power 45.8	97
Two-sample testing	CIFAR-10 vs CIFAR-10.1 1.0 (test)	Test Power 0.062	54
Two-sample testing	CIFAR10-WRN8	Test Power 35.5	49
Two-sample testing	CIFAR10 WRN28	Test Power 13.5	49
Two-sample testing	BLOB (test)	Test Power 12.8	49
Two-sample testing	Blob	Test Power 0.085	49
Two-sample testing	Gaussian mixture data Synthetic Example 1 d=10	Test Power 100	44
Two-sample test	Higgs alpha=0.05 (test)	Test Power 97.4	42

Showing 10 of 15 rows
