SOTA Knowledge Reasoning benchmarks and papers with code | Wizwand
Knowledge Reasoning
Benchmarks
| Dataset Name | SOTA Method | Metric | Score | Results | Last Updated |
|---|---|---|---|---|---|
| MMLU-Pro | SIGMA | Accuracy | 91.43 | 120 | 5d ago |
| MMLU | SIGMA | MMLU Knowledge Reasoning Accuracy | 92.16 | 73 | 14d ago |
| Knowledge Reasoning Benchmarks (MATH, GPQA, MMLU, GSM8K) (test) | PDA-GAM (3∆) | MATH Accuracy | 89.6 | 24 | 1d ago |
| K-Cross (test) | CMA | Accuracy | 41.5 | 22 | 5d ago |
| K-Cross (val) | PSO-Merging | Accuracy | 43 | 22 | 5d ago |
| GPQA | LLaDA-2.1-flash | Accuracy | 67.3 | 18 | 1mo ago |
| GPQA Diamond | TRICE-30B | Accuracy (avg@8) | 75.4 | 16 | 26d ago |
| GPQA Diamond | General Teacher | Accuracy | 62.5 | 12 | 7d ago |
| Tobacco Pest and Disease Control (test) | GraphRAG + ChatGLM | Accuracy | 90.1 | 8 | 3mo ago |
| GPQA Diamond | Gemini-2.5 Flash-Thinking | pass@1 | 82.8 | 7 | 3mo ago |
| Super GPQA | Qwen2.5-32B-Instruct + Bootcamp-SFT-RL | Accuracy | 51 | 6 | 13d ago |
| MMLU 57 subjects (test) | UTS-guided system | Accuracy | 74.3 | 6 | 3mo ago |
| MMLU | DeepSeek-R1 0528 671B | Pass@1 | 89.9 | 6 | 3mo ago |
| BLINK | InterSketch | Accuracy | 63 | 5 | 7d ago |
| MMLU Pro (test) | MiMo-VL-Miloco-7B | EM | 68.5 | 5 | 3mo ago |
| MMMU | InterSketch | Accuracy | 68.6 | 4 | 7d ago |
| MMLU c | 1B Model (Multilingual classifier) | Normalized Accuracy | 29.87 | 3 | 1mo ago |
| Include c | 1B Model (Multilingual classifier) | Normalized Accuracy | 35.41 | 3 | 1mo ago |
| GMMLU c | 1B Model (Multilingual classifier) | Normalized Accuracy | 32 | 3 | 1mo ago |
| CMMLU c | 1B Model (Multilingual classifier) | Accuracy (Normalized) | 36.08 | 3 | 1mo ago |
| GPQA Diamond | ExpertWeaver | Score | 37.3 | 3 | 3mo ago |
Showing 21 of 21 rows