Mathematical Reasoning on GSM8K (test) (Accuracy, Time (s))
[Chart: Accuracy and Latency (s) on GSM8K (test). Best accuracy: 80.97 (STP), Feb 9, 2026.]
Updated 1mo ago
Evaluation Results
Method             Details                    Date     Accuracy  Latency (s)
STP                Base model=LLaDA-8B-In...  2026.02  80.97     144,536
GRPO w/ ELBO       Base model=LLaDA-8B-In...  2026.02  80.52     163,035
Diffu-GRPO         Base model=LLaDA-8B-In...  2026.02  80.21     146,641
LLaDA-8B-Instruct  Base model=LLaDA-8B-In...  2026.02  77.79     -