Noisy-RAG Question Answering on CoQA
Top result: Qwen2.5-OpAmp-72B, 92.4 Exact Match (EM), Feb 18, 2025
Evaluation Results
Method                 Details                     Date      Exact Match (EM)
Qwen2.5-OpAmp-72B      Parameters=72B, Adapta...   2025.02   92.4
GPT-4o-0806            Version=0806                2025.02   88.6
DeepSeek-V3            Version=V3                  2025.02   88.4
Llama3.3-70B-inst      Parameters=70B, Type=I...   2025.02   88.2
Qwen2.5-72B-inst       Parameters=72B, Type=I...   2025.02   85.8
Llama3.1-OpAmp-8B      Parameters=8B               2025.02   85.4
Qwen2.5-7B-inst        Parameters=7B               2025.02   84.2
Llama3.1-8B-inst       Parameters=8B               2025.02   82.2
Llama3-ChatQA2-70B     Parameters=70B, Versio...   2025.02   80.2
Llama3-ChatQA2-8B      Parameters=8B               2025.02   78.2
Mistral-7B-inst-v0.3   Parameters=7B               2025.02   70.6