Knowledge-intensive QA on NaturalQuestions (NQ) 5-shot
[Chart: EM Accuracy over time. Top result: ADAFUSE (Top-2 Base), 42.85 EM Accuracy, Jan 9, 2026.]
Evaluation Results
Method                     Details                     Date      EM Accuracy
ADAFUSE (Top-2 Base)       Base model selection s...   2026.01   42.85
ADAFUSE (Fixed Base)       Base model selection s...   2026.01   42.85
UniTE                      Base model selection s...   2026.01   38.95
SWEETSPAN                  Base model selection s...   2026.01   37.59
DEEPEN                     Base model selection s...   2026.01   37.42
LLM-BLENDER                Base model selection s...   2026.01   36.56
LLaMA-3.1-8B-Instruct      Model Type=Base Model       2026.01   34.24
Mistral-7B-Instruct-v0.3   Model Type=Base Model       2026.01   32.05
InternLM3-8B-Instruct      Model Type=Base Model       2026.01   27.15
Qwen3-8B                   Model Type=Base Model       2026.01   23.85