| Dataset Name | SOTA Method | Metric | Value | Trend | Updated |
|---|---|---|---|---|---|
| OpenEQA v1.0 (test) | CoV | LLM-Match | 67.7 | 32 | 1mo ago |
| A-EQA | Human Agent | Overall (LLM-Match) | 85.1 | 25 | 26d ago |
| OpenEQA | MiMo-Embodied-7B | Score | 74.1 | 21 | 1mo ago |
| Episodic Memory EQA (EM-EQA) | | LLM Match | 86.8 | 19 | 12d ago |
| OpenEQA EM-EQA | | Accuracy | 86.8 | 19 | 1mo ago |
| RoboVQA | Tarsier2 | BLEU-1 | 77.1 | 13 | 1mo ago |
| A-EQA 1.0 (test) | MetaNav | LLM-Match | 58.3 | 10 | 16d ago |
| OpenEQA EM-EQA Episodes up to 32 frames | | LLM-Match Score | 86.8 | 10 | 24d ago |
| OpenEQA (test) | | GPT-4 Score | 86.8 | 10 | 19d ago |
| EgoTaskQA (test) | | Exact Match | 80 | 10 | 1mo ago |
| OpenEQA EM-EQA | Human | LLM-Match | 86.8 | 8 | 1mo ago |
| A-EQA 184 (subset) | 3D-Mem | LLM-Match | 52.6 | 7 | 1mo ago |
| A-EQA 184-question subset | HGR | LLM-Match | 55.9 | 6 | 12d ago |
| HM-EQA | FAST-EQA | SR | 69.2 | 6 | 1mo ago |
| NiEH EQA (test) | LGT | Accuracy | 56.34 | 6 | 1mo ago |
| DynHiL-EQA (Static) | DIVRR | Accuracy | 58.2 | 5 | 1mo ago |
| DynHiL-EQA Dynamic | DIVRR | Accuracy | 55.1 | 5 | 1mo ago |
| DynHiL-EQA (all) | DIVRR | Accuracy | 56.6 | 5 | 1mo ago |
| HM-EQA (static) | DIVRR | Accuracy | 63.8 | 5 | 1mo ago |
| OpenEQA v1 (test) | | Score | 85.1 | 5 | 1mo ago |
| MT-HM3D | Memory-EQA | SR | 55.1 | 4 | 1mo ago |
| Fine-EQA | VG-AVS | LLM-Match (Attr) | 59.08 | 4 | 1mo ago |
| Open-EQA (test) | Fine-EQA w/ SFT+RL (VG-AVS) | Object Recognition | 52.8 | 4 | 1mo ago |
| MP3D-EQA v1 (test) | NaviLLM | Success Rate (SR) | 47.78 | 4 | 1mo ago |
| CAEQs 40 scenarios | ConEQsA | Accuracy | 65 | 3 | 1mo ago |