Coding Reasoning on Apps
Top result: GDPO_2-obj, Pass Rate 68.3
[Chart: Pass Rate, Jan 8, 2026; selectable metrics: Pass Rate, Exceed Rate, Bug Rate]
Updated 3d ago
Evaluation Results
| Method | Links | Pass Rate | Exceed Rate | Bug Rate |
|---|---|---|---|---|
| GDPO_2-obj | Reward Optimization Ob... (2026.01) | 68.3 | 5 | 23.5 |
| GRPO_3-obj | Reward Optimization Ob... (2026.01) | 68.1 | 11.2 | 20.3 |
| GDPO_3-obj | Reward Optimization Ob... (2026.01) | 67.8 | 8.5 | 18.8 |
| GRPO_2-obj | Reward Optimization Ob... (2026.01) | 67.2 | 5.2 | 25 |
| DeepSeek-R1-7B (Baseline) | Reward Optimization Ob... (2026.01) | 28.1 | 73.9 | 32.9 |