Multi-hop Reasoning on MultiHopRAG
Top result: Qwen2.5-OpAmp-72B, 89.6 EM (Feb 18, 2025)
Evaluation Results
Method                 Details                    Date      EM
Qwen2.5-OpAmp-72B      Parameters=72B, Adapta...  2025.02   89.6
Qwen2.5-72B-inst       Parameters=72B, Type=I...  2025.02   89.2
DeepSeek-V3            Version=V3                 2025.02   88.6
GPT-4o-0806            Version=0806               2025.02   87.7
Llama3.3-70B-inst      Parameters=70B, Type=I...  2025.02   83.7
Llama3-ChatQA2-70B     Parameters=70B, Versio...  2025.02   78.2
Llama3.1-OpAmp-8B      Parameters=8B              2025.02   70.5
Mistral-7B-inst-v0.3   Parameters=7B              2025.02   69.5
Qwen2.5-7B-inst        Parameters=7B              2025.02   66.9
Llama3.1-8B-inst       Parameters=8B              2025.02   63.9
Llama3-ChatQA2-8B      Parameters=8B              2025.02   50.9