Reasoning on MDM (test)
Top result: TCR-gold, 58.3 Accuracy (Jan 29, 2026)
Evaluation Results

| Method | Backbone | Date | Accuracy |
|---|---|---|---|
| TCR-gold | Qwen2.5-7B-In... | 2026.01 | 58.3 |
| TCR | Qwen2.5-7B-In... | 2026.01 | 48.2 |
| TCR-gold | Qwen3-8B-Inst... | 2026.01 | 47.1 |
| Qwen2.5-7B-Instruct | Qwen2.5-7B-In... | 2026.01 | 43 |
| DoLa | Qwen2.5-7B-In... | 2026.01 | 38.5 |
| TCR | Qwen3-8B-Inst... | 2026.01 | 29.3 |
| Qwen3-8B-Instruct | Qwen3-8B-Inst... | 2026.01 | 24.7 |
| TCR-gold | Phi-3-Instruc... | 2026.01 | 15.8 |
| TCR | Phi-3-Instruc... | 2026.01 | 11.5 |
| DoLa | Qwen3-8B-Inst... | 2026.01 | 9 |
| Phi-3-Instruct | Phi-3-Instruc... | 2026.01 | 8.1 |
| DoLa | Phi-3-Instruc... | 2026.01 | 6.7 |
| LLaMA3-8B-Instruct | LLaMA3-8B-Ins... | 2026.01 | 0 |
| DoLa | LLaMA3-8B-Ins... | 2026.01 | 0 |
| TCR | LLaMA3-8B-Ins... | 2026.01 | 0 |
| TCR-gold | LLaMA3-8B-Ins... | 2026.01 | 0 |