Open-domain Question Answering on NaturalQuestions standard (test)
[Chart: Accuracy. Top result: 45.5, Ours (theory-guided context selection strategy), Feb 9, 2026.]
Evaluation Results
Method                                            Model          Date     Accuracy
Ours (theory-guided context selection strategy)   Qwen3-8B       2026.02  45.5
ReMem                                             Qwen3-8B       2026.02  45.3
ExpRAG                                            Qwen3-8B       2026.02  45.1
DC                                                Qwen3-8B       2026.02  45
BM25                                              Qwen3-8B       2026.02  44.6
Zero                                              Qwen3-8B       2026.02  44.2
ReMem                                             Llama-3.1-8B   2026.02  39.8
Ours (theory-guided context selection strategy)   Llama-3.1-8B   2026.02  39.7
ExpRAG                                            Llama-3.1-8B   2026.02  39.4
DC                                                Llama-3.1-8B   2026.02  39.3
BM25                                              Llama-3.1-8B   2026.02  38.9
Zero                                              Llama-3.1-8B   2026.02  38.5