Measuring abstract reasoning in neural networks

About

Whether neural networks can learn abstract reasoning or whether they merely rely on superficial statistics is a topic of recent debate. Here, we propose a dataset and challenge designed to probe abstract reasoning, inspired by a well-known human IQ test. To succeed at this challenge, models must cope with various generalisation `regimes' in which the training and test data differ in clearly-defined ways. We show that popular models such as ResNets perform poorly, even when the training and test sets differ only minimally, and we present a novel architecture, with a structure designed to encourage reasoning, that does significantly better. When we vary the way in which the test questions and training data differ, we find that our model is notably proficient at certain forms of generalisation, but notably weak at others. We further show that the model's ability to generalise improves markedly if it is trained to predict symbolic explanations for its answers. Altogether, we introduce and explore ways to both measure and induce stronger abstract reasoning in neural networks. Our freely-available dataset should motivate further progress in this direction.

David G.T. Barrett, Felix Hill, Adam Santoro, Ari S. Morcos, Timothy Lillicrap• 2018

Related benchmarks

Task	Dataset	Result
Abstract Visual Reasoning	SVRT reformulated four-choice (test)	Accuracy70.3	28
Abstract Visual Reasoning	RAVEN v1 (test)	Average Accuracy34	22
Abstract Reasoning	I-RAVEN (test)	Accuracy (Overall)14.87	21
Few-shot classification	Bongard-LOGO (test)	Free-Form Score50.1	21
Compositional Visual Reasoning	CVR	Accuracy (Joint)64.5	16
Abstract Visual Reasoning	MC2R 20 samples (train)	Accuracy10.6	12
Abstract Visual Reasoning	MC2R (100 train samples)	Accuracy11	12
Abstract Visual Reasoning	MC2R 50 samples (train)	Accuracy10.5	12
Abstract Visual Reasoning	MC2R 200 (train)	Accuracy11	12
Abstract Visual Reasoning	MC2R 500 samples (train)	Accuracy0.116	12

Showing 10 of 29 rows

Other info

Follow for update

@wizwand_team Discord