Code Reasoning on LiveCodeBench 1.0 (test)
Top result: A3PO, 47.2 Accuracy (Dec 25, 2025)
Evaluation Results
Method               Model                Date     Accuracy
-------------------  -------------------  -------  --------
A3PO                 Deepseek-R1-Dist...  2025.12  47.2
Lp-Reg               Deepseek-R1-Dist...  2025.12  44.7
W-REINFORCE          Deepseek-R1-Dist...  2025.12  44.6
DAPO w/ Fork Tokens  Deepseek-R1-Dist...  2025.12  44.1
DAPO                 Deepseek-R1-Dist...  2025.12  43.2
GRPO                 Deepseek-R1-Dist...  2025.12  42.5
A3PO                 Qwen3-8B-Base        2025.12  33.8
DAPO w/ Fork Tokens  Qwen3-8B-Base        2025.12  31.2
Lp-Reg               Qwen3-8B-Base        2025.12  30.9
W-REINFORCE          Qwen3-8B-Base        2025.12  30.4
DAPO                 Qwen3-8B-Base        2025.12  29.7
GRPO                 Qwen3-8B-Base        2025.12  29.4
A3PO                 Qwen2.5-7B-Math      2025.12  16.4
DAPO w/ Fork Tokens  Qwen2.5-7B-Math      2025.12  14.3
W-REINFORCE          Qwen2.5-7B-Math      2025.12  13.8
Lp-Reg               Qwen2.5-7B-Math      2025.12  13.8
DAPO                 Qwen2.5-7B-Math      2025.12  12.4
GRPO                 Qwen2.5-7B-Math      2025.12  11.6