Question Answering on HotQA
Top result: CDKC, 48.5 Accuracy (Feb 13, 2026)
Evaluation Results
| Method | Configuration | Date | Accuracy |
|---|---|---|---|
| CDKC | Backbone=Qwen2.5-14B-I... | 2026.02 | 48.5 |
| GRPO | Backbone=Qwen2.5-14B-I... | 2026.02 | 48.17 |
| CDKC | Backbone=Qwen2.5-3B-In... | 2026.02 | 40.07 |
| GRPO | Backbone=Qwen2.5-3B-In... | 2026.02 | 38.67 |
| RAG | Backbone=Qwen2.5-14B-I... | 2026.02 | 38.6 |
| CGKE | Backbone=Qwen2.5-14B-I... | 2026.02 | 33.27 |
| CoT | Backbone=Qwen2.5-14B-I... | 2026.02 | 32.9 |
| RAG | Backbone=Qwen2.5-3B-In... | 2026.02 | 32.57 |
| Vanilla SFT | Backbone=Qwen2.5-14B-I... | 2026.02 | 28 |
| Vanilla LLM | Backbone=Qwen2.5-14B-I... | 2026.02 | 26.4 |
| CGKE | Backbone=Qwen2.5-3B-In... | 2026.02 | 25.47 |
| CoT | Backbone=Qwen2.5-3B-In... | 2026.02 | 21.43 |
| Vanilla SFT | Backbone=Qwen2.5-3B-In... | 2026.02 | 19.9 |
| Vanilla LLM | Backbone=Qwen2.5-3B-In... | 2026.02 | 16.9 |