Reasoning on MOAS (test)
[Chart: Accuracy by date (Jan 29, 2026); highlighted point: TCR-gold, Accuracy 95.8]
Updated 4d ago
Evaluation Results
Method               Backbone            Date     Accuracy
TCR-gold             Qwen3-8B-Inst...    2026.01  95.8
TCR                  Qwen3-8B-Inst...    2026.01  93.6
Qwen3-8B-Instruct    Qwen3-8B-Inst...    2026.01  92.7
DoLa                 Qwen3-8B-Inst...    2026.01  89.7
TCR-gold             Qwen2.5-7B-In...    2026.01  62
TCR-gold             LLaMA3-8B-Ins...    2026.01  47
TCR                  Qwen2.5-7B-In...    2026.01  46
DoLa                 Qwen2.5-7B-In...    2026.01  40
TCR                  LLaMA3-8B-Ins...    2026.01  39.4
Qwen2.5-7B-Instruct  Qwen2.5-7B-In...    2026.01  39.2
DoLa                 LLaMA3-8B-Ins...    2026.01  29.8
LLaMA3-8B-Instruct   LLaMA3-8B-Ins...    2026.01  22.9
TCR                  Phi-3-Instruc...    2026.01  4
TCR-gold             Phi-3-Instruc...    2026.01  4
Phi-3-Instruct       Phi-3-Instruc...    2026.01  1
DoLa                 Phi-3-Instruc...    2026.01  1