Our new X account is live! Follow @wizwand_team for updates
SOTA Knowledge benchmarks and papers with code | Wizwand
Knowledge
Benchmarks
| Dataset Name | SOTA Method | Metric | Value | Results | Last Updated |
| --- | --- | --- | --- | --- | --- |
| MMLU | DictaLM 3.0 24B-Think | Accuracy | 85.93 | 71 | 2d ago |
| GPQA | DeepSeek-R1-Distill-Qwen-32B (Reasoning) | Accuracy | 59.39 | 34 | 2d ago |
| ARC Easy | DeepSeek-R1-Distill-Qwen-32B (Reasoning) | ARC-E Score | 99.54 | 31 | 2d ago |
| ARC Challenge | ReasonAny | ARC-C Score | 96.43 | 31 | 2d ago |
| MMLU-Pro | Ling-flash-2.0 | Score | 77.55 | 30 | 4d ago |
| MMB | TAIA | Accuracy | 61.98 | 21 | 4d ago |
| CommonSenseQA CoQA | Dense | Score | 66.91 | 20 | 4d ago |
| GPQA | Ling-flash-2.0 | Score | 69.16 | 17 | 4d ago |
| C-EVAL | Qwen3-30B-A3B-Inst-2507 | Score | 88.12 | 12 | 4d ago |
| OpenBookQA (test) | Qwen3-Omni-Instruct | Accuracy | 92.31 | 11 | 4d ago |
| MMSU (test) | Qwen3-Omni-Instruct | Performance | 77 | 11 | 4d ago |
| TriviaQA | LLaDA2.1-flash | Score | 72.93 | 10 | 4d ago |
| PHYBench | LLaDA2.0-flash | Score | 30.06 | 10 | 4d ago |
| GPQA | ReasonAny | GPQA Score | 57.5 | 9 | 2d ago |
| Knowledge Suite | Task Arithmetic | ARC-C | 84.07 | 9 | 2d ago |
| CLIcK | DeepSeek-V3.1 | Score | 80.9 | 7 | 4d ago |
| KMMLU-Pro | Qwen3-235B-A22B Instruct-2507 | Score | 70.9 | 7 | 4d ago |
| CMMLU | MiniCPM-4.1 | Knowledge Score | 84.72 | 6 | 4d ago |
| C-Eval | Qwen3-1.7B-ALLMEM | C-Eval Knowledge Accuracy | 0.589 | 4 | 4d ago |
| PopQA | DictaLM 3.0 24B-Think | Accuracy | 30.59 | 3 | 2d ago |
| KMMLU Redux | DeepSeek-V3.1 | Score | 75.9 | 3 | 4d ago |
| KMMLU | DeepSeek-V3.1 | Score | 0.787 | 3 | 4d ago |
| OpenThaiEval | Typhoon-S-8B | Score | 67.06 | 2 | 4d ago |
| MMLU EN | Qwen3-8B | Knowledge Score | 42.18 | 2 | 4d ago |
| GPQA EN | Qwen3-8B | Score | 41.41 | 2 | 4d ago |
Showing 25 of 25 rows