Mathematical Reasoning on DeepMath 103K subset
[Chart: Accuracy over time. Top result: TTVS, 65.9 accuracy, Apr 9, 2026.]
Updated 8d ago
Evaluation Results
Method  Backbone          Date     Accuracy
TTVS    Qwen3-4B          2026.04  65.9
TTRL    Qwen3-4B          2026.04  62.8
Init    Qwen3-4B          2026.04  53.2
TTVS    LLaMA-3.2-3B-...  2026.04  49
TTRL    LLaMA-3.2-3B-...  2026.04  46.6
Init    LLaMA-3.2-3B-...  2026.04  44.1
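
The gaps between methods can be read off the accuracies listed above. The following is a minimal sketch of that arithmetic, not part of the benchmark itself: the `results` dictionary and the `gains` helper are hypothetical names introduced for illustration, and the LLaMA backbone name is kept truncated exactly as it appears in the listing.

```python
# Accuracy values copied from the Evaluation Results listing.
# The second backbone name is truncated in the source and left as-is.
results = {
    "Qwen3-4B": {"TTVS": 65.9, "TTRL": 62.8, "Init": 53.2},
    "LLaMA-3.2-3B-...": {"TTVS": 49.0, "TTRL": 46.6, "Init": 44.1},
}


def gains(scores, method="TTVS"):
    """Return the accuracy-point difference between `method` and each other row."""
    return {
        other: round(scores[method] - acc, 1)
        for other, acc in scores.items()
        if other != method
    }


for backbone, scores in results.items():
    print(backbone, gains(scores))
```

Differences are rounded to one decimal place to match the precision of the reported accuracies.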