Our new X account is live! Follow @wizwand_team for updates
WorkDL logo mark

Learning to Compose Neural Networks for Question Answering

About

We describe a question answering model that applies to both images and structured knowledge bases. The model uses natural language strings to automatically assemble neural networks from a collection of composable modules. Parameters for these modules are learned jointly with network-assembly parameters via reinforcement learning, with only (world, question, answer) triples as supervision. Our approach, which we term a dynamic neural model network, achieves state-of-the-art results on benchmark datasets in both visual and structured domains.

Jacob Andreas, Marcus Rohrbach, Trevor Darrell, Dan Klein• 2016

Related benchmarks

TaskDatasetResultRank
Visual Question AnsweringVQA (test-dev)
Acc (All)59.4
147
Visual Question AnsweringVQA (test-std)--
110
Open-Ended Visual Question AnsweringVQA 1.0 (test-dev)
Overall Accuracy59.4
100
Visual Question Answering (Multiple-choice)VQA 1.0 (test-dev)
Accuracy (All)62.5
66
Open-Ended Visual Question AnsweringVQA 1.0 (test-standard)
Overall Accuracy59.4
50
Visual Question AnswerVQA 1.0 (test-dev)
Overall Accuracy59.4
44
Open-Ended Visual Question AnsweringVQA (test-standard)
Accuracy (Overall)59.4
32
Visual Question AnsweringVQA 1 (test-standard)
VQA Open-Ended Accuracy (All)59.4
28
Visual ReasoningNLVR v1 (Test-U)
Accuracy62
8
Showing 9 of 9 rows

Other info

Follow for update