Multi-hop Question Answering on MMhops Comparison (test)

Top result: Gemini-2.5-pro, 29.39 accuracy

Updated 4mo ago

Evaluation Results

Method	Date	Accuracy
Gemini-2.5-pro	2025.12	29.39
Gemini-2.5-flash	2025.12	23.18
MMhops-R1	2025.12	22.01
Self-Ask	2025.12	18.27
OmniSearch	2025.12	17.02
Vanilla mRAG	2025.12	9.72
GPT-4o	2025.12	8.76
Zero-shot	2025.12	7.59
GPT-4o-mini	2025.12	7.05
Search-r1	2025.12	6.62
Zero-shot	2025.12	6.2
EchoSight	2025.12	4.81
Vanilla mRAG	2025.12	3.95