Embodied Question Answering
About
We present a new AI task -- Embodied Question Answering (EmbodiedQA) -- where an agent is spawned at a random location in a 3D environment and asked a question ("What color is the car?"). In order to answer, the agent must first intelligently navigate to explore the environment, gather information through first-person (egocentric) vision, and then answer the question ("orange"). This challenging task requires a range of AI skills -- active perception, language understanding, goal-driven navigation, commonsense reasoning, and grounding of language into actions. In this work, we develop the environments, end-to-end-trained reinforcement learning agents, and evaluation protocols for EmbodiedQA.
Abhishek Das, Samyak Datta, Georgia Gkioxari, Stefan Lee, Devi Parikh, Dhruv Batra • 2017
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| Episodic Memory Question Answering (Egocentric pixel) | Matterport3D | IoU 5.26 | 8 |
| Episodic Memory Question Answering (Top-down map) | Matterport3D | IoU 4.75 | 8 |
| Embodied Question Answering | MP3D-EQA v1 (test) | -- | 4 |