Active Question Answering on OpenEQA v1 (HM3D)

85.1 LLM-Match

Human baseline

Updated 5mo ago

Evaluation Results

Method	Links	LLM-Match
Human baseline 2025.12		85.1	-
R4 2025.12		74	63.37
3D-Mem 2025.12		52.6	42
GPT-4V 2025.12		41.8	7.5
GPT-4 w/ LLaVA-1.5 2025.12		38.1	7
GPT-4 2025.12		35.5	-
GPT-4 w/ CG 2025.12		34.4	6.5
GPT-4 w/ SVM 2025.12		34.2	6.4
LLaMA-2 w/ LLaVA-1.5 2025.12		30.9	5.9
LLaMA-2 w/ SVM 2025.12		29.9	5.5
LLaMA-2 2025.12		29	-
LLaMA-2 w/ CG 2025.12		23.9	4.3