Arithmetic Reasoning on Countdown 256 tokens
Best Pass@1: 71.1 (d-TreeRPO), Dec 10, 2025
Evaluation Results
| Method | Base Model | Date | Pass@1 |
|---|---|---|---|
| d-TreeRPO | LLaDA-8B-In... | 2025.12 | 71.1 |
| d-TreeRPO | LLaDA-MoE-7... | 2025.12 | 67.2 |
| GDPO | LLaDA-8B-In... | 2025.12 | 64.1 |
| GDPO | LLaDA-MoE-7... | 2025.12 | 58.1 |
| wd1 | LLaDA-MoE-7... | 2025.12 | 56.6 |
| SAPO | LLaDA-MoE-7... | 2025.12 | 54.2 |
| TraceRL | LLaDA-MoE-7... | 2025.12 | 54.2 |
| SAPO | LLaDA-8B-In... | 2025.12 | 52 |
| wd1 | LLaDA-8B-In... | 2025.12 | 51.2 |
| TraceRL | LLaDA-8B-In... | 2025.12 | 50.4 |
| Diffu-GRPO | LLaDA-MoE-7... | 2025.12 | 50.1 |
| LLaDA-MoE-7BA1B-Instruct | LLaDA-MoE-7... | 2025.12 | 42.6 |
| Diffu-GRPO | LLaDA-8B-In... | 2025.12 | 31.3 |
| VRPO (LLaDA-1.5) | LLaDA-8B-In... | 2025.12 | 22.3 |
| LLaDA-8B-Instruct | LLaDA-8B-In... | 2025.12 | 19.5 |