Training independent subnetworks for robust prediction

About

Recent approaches to efficiently ensemble neural networks have shown that strong robustness and uncertainty performance can be achieved with a negligible gain in parameters over the original network. However, these methods still require multiple forward passes for prediction, leading to a significant computational cost. In this work, we show a surprising result: the benefits of using multiple predictions can be achieved `for free' under a single model's forward pass. In particular, we show that, using a multi-input multi-output (MIMO) configuration, one can utilize a single model's capacity to train multiple subnetworks that independently learn the task at hand. By ensembling the predictions made by the subnetworks, we improve model robustness without increasing compute. We observe a significant improvement in negative log-likelihood, accuracy, and calibration error on CIFAR10, CIFAR100, ImageNet, and their out-of-distribution variants compared to previous methods.

Marton Havasi, Rodolphe Jenatton, Stanislav Fort, Jeremiah Zhe Liu, Jasper Snoek, Balaji Lakshminarayanan, Andrew M. Dai, Dustin Tran• 2020

Related benchmarks

Task	Dataset	Result
Image Classification	CIFAR-10 (test)	Accuracy95.4	3381
Image Classification	CIFAR-100	Accuracy78.69	691
Image Classification	CIFAR-100	--	302
Image Classification	CIFAR-10-C	Accuracy69.99	179
Out-of-Distribution Detection	ImageNet-1K	FPR@9552.64	156
Classification	CIFAR-100 (test)	Accuracy56.2	129
Image Classification	CIFAR-100-C	Accuracy (Corruption)47.35	109
Out-of-Distribution Detection	CIFAR-10 vs SVHN	AUC0.9387	38
Image Classification	CIFAR-100 WRN-28-10 (test)	Top-1 Accuracy82.74	28
Glaucoma Classification	retinal Glaucoma dataset (test)	Accuracy0.724	28

Showing 10 of 24 rows

Other info

Follow for update

@wizwand_team Discord