Arithmetic Reasoning on Countdown 512 tokens
[Chart: Pass@1 over time. Top result: d-TreeRPO, 62.1 Pass@1 (Dec 10, 2025).]
Updated 4d ago
Evaluation Results
| Method | Base Model | Date | Pass@1 |
|---|---|---|---|
| d-TreeRPO | LLaDA-8B-In... | 2025.12 | 62.1 |
| d-TreeRPO | LLaDA-MoE-7... | 2025.12 | 60.6 |
| GDPO | LLaDA-8B-In... | 2025.12 | 60.2 |
| wd1 | LLaDA-MoE-7... | 2025.12 | 58.7 |
| SAPO | LLaDA-8B-In... | 2025.12 | 56.3 |
| SAPO | LLaDA-MoE-7... | 2025.12 | 54.8 |
| GDPO | LLaDA-MoE-7... | 2025.12 | 53.4 |
| TraceRL | LLaDA-8B-In... | 2025.12 | 52.6 |
| TraceRL | LLaDA-MoE-7... | 2025.12 | 49.1 |
| Diffu-GRPO | LLaDA-MoE-7... | 2025.12 | 48.2 |
| wd1 | LLaDA-8B-In... | 2025.12 | 46.1 |
| LLaDA-MoE-7BA1B-Instruct | LLaDA-MoE-7... | 2025.12 | 41.4 |
| Diffu-GRPO | LLaDA-8B-In... | 2025.12 | 37.1 |
| VRPO (LLaDA-1.5) | LLaDA-8B-In... | 2025.12 | 18 |
| LLaDA-8B-Instruct | LLaDA-8B-In... | 2025.12 | 16 |