Reasoning on MMLU-Pro 233-sample stratified
[Chart: Accuracy by method. Top result: DPA, 86.71 accuracy, May 7, 2026.]
Evaluation Results
| Method | Configuration | Date | Accuracy |
|---|---|---|---|
| DPA | Model=Qwen3.5-9B, Prep... | 2026.05 | 86.71 |
| ITI | Model=Qwen3.5-9B, Prep... | 2026.05 | 79.81 |
| Instruction | Model=Qwen3.5-9B, Prep... | 2026.05 | 78.87 |
| Original | Model=Qwen3.5-9B, Prep... | 2026.05 | 77.62 |
| DoLa | Model=Qwen3.5-9B, Prep... | 2026.05 | 75.94 |
| TAE | Model=Qwen3.5-9B, Prep... | 2026.05 | 75 |
| CDS | Model=Qwen3.5-9B, Prep... | 2026.05 | 60.99 |
| CDS | Model=LLaMA3.1-8B, Pre... | 2026.05 | 34.01 |
| DPA | Model=LLaMA3.1-8B, Pre... | 2026.05 | 28.78 |
| DoLa | Model=LLaMA3.1-8B, Pre... | 2026.05 | 24.62 |
| TAE | Model=LLaMA3.1-8B, Pre... | 2026.05 | 23.61 |
| Original | Model=LLaMA3.1-8B, Pre... | 2026.05 | 21.43 |
| Instruction | Model=LLaMA3.1-8B, Pre... | 2026.05 | 20.59 |
| ITI | Model=LLaMA3.1-8B, Pre... | 2026.05 | 9.95 |