| Task Name | Dataset Name | Metric | SOTA Result | Trend |
|---|---|---|---|---|
| 3D Question Answering | ScanQA (val) | METEOR | 23.1 | 217 |
| 3D Question Answering | ScanQA w/ objects (test) | EM@1 | 31.29 | 55 |
| 3D Question Answering | ScanQA w/o objects (test) | EM@1 | 30.87 | 51 |
| 3D Question Answering | ScanQA | EM (Exact Match) | 30.3 | 38 |
| 3D Question Answering | ScanQA v1.0 (test) | ROUGE | 72.7 | 26 |
| 3D Question Answering | ScanQA (test) | BLEU-4 | 13.9 | 20 |
| 3D Scene Understanding | ScanQA | METEOR | 20.8 | 16 |
| Visual Question Answering | ScanQA | CIDEr | 88.77 | 16 |
| Question Answering | ScanQA v1.0 (val) | EM@1 | 22.59 | 16 |
| Vision-Language Reasoning | ScanQA ScanNet scenes (test) | BLEU-1 | 43.3 | 13 |
| 3D Question Answering | ScanQA v1.0 (val) | BLEU-4 | 16.2 | 13 |
| 3D Visual Question Answering | ScanQA | C Score | 87.7 | 10 |
| 3D Scene Question Answering | ScanQA | M Score | 22.9 | 6 |
| 3D Visual Question Answering | ScanQA | EM@1 | 23.2 | 5 |
| 3D Question Answering | ScanQA Leaderboard as of Nov 2023 (test) | EM | 30.82 | 5 |
| Question Answering | ScanQA | EM | 30.5 | 4 |
| QA-driven Grounding | ScanQA | F1@50 | 40.8 | 3 |
| Question Answering | ScanQA | EM-R | 24.5 | 3 |
| Object localization | ScanQA w/ objects (test) | Acc@0.25 | 25.44 | 3 |
| Object localization | ScanQA (val) | Acc@0.25 | 24.96 | 3 |
| Visual Grounding | ScanQA-G | F1@50 | 40.8 | 2 |