Question Answering on PopQA (Accuracy)
[Chart: Accuracy over time, Feb 2 to Feb 13, 2026. Best result: 41.3 (RAG).]
Updated 3d ago
Evaluation Results
| Method | Details | Date | Accuracy |
|---|---|---|---|
| RAG | Backbone=Qwen2.5-14B-I... | 2026.02 | 41.3 |
| RAG | Backbone=Qwen2.5-3B-In... | 2026.02 | 39.57 |
| CDKC | Backbone=Qwen2.5-14B-I... | 2026.02 | 35.2 |
| GRPO | Backbone=Qwen2.5-14B-I... | 2026.02 | 34.23 |
| DictaLM 3.0 12B-Inst | Parameters=12B, Varian... | 2026.02 | 26.31 |
| CGKE | Backbone=Qwen2.5-14B-I... | 2026.02 | 23.53 |
| CDKC | Backbone=Qwen2.5-3B-In... | 2026.02 | 22.87 |
| Gemma 3 12B | Parameters=12B | 2026.02 | 22.64 |
| GRPO | Backbone=Qwen2.5-3B-In... | 2026.02 | 22.53 |
| CoT | Backbone=Qwen2.5-14B-I... | 2026.02 | 21.03 |
| Vanilla SFT | Backbone=Qwen2.5-14B-I... | 2026.02 | 20.73 |
| Vanilla LLM | Backbone=Qwen2.5-14B-I... | 2026.02 | 19.6 |
| CGKE | Backbone=Qwen2.5-3B-In... | 2026.02 | 16.13 |
| Vanilla SFT | Backbone=Qwen2.5-3B-In... | 2026.02 | 11.9 |
| CoT | Backbone=Qwen2.5-3B-In... | 2026.02 | 11.3 |
| Vanilla LLM | Backbone=Qwen2.5-3B-In... | 2026.02 | 10.07 |