Coding Reasoning on Taco
[Chart: Pass Rate by date. Highlighted entry: GDPO_2-obj, Pass Rate 48.4, Jan 8, 2026. Selectable metrics: Pass Rate, Exceed Rate, Bug Rate.]
Updated 3d ago
Evaluation Results
Method                     Links                      Date     Pass Rate  Exceed Rate  Bug Rate
GDPO_2-obj                 Reward Optimization Ob...  2026.01  48.4       10.8         36.2
GRPO_2-obj                 Reward Optimization Ob...  2026.01  45.1       11.8         37.7
GDPO_3-obj                 Reward Optimization Ob...  2026.01  45.1       10.6         28
GRPO_3-obj                 Reward Optimization Ob...  2026.01  44.4       14.7         30
DeepSeek-R1-7B (Baseline)  Reward Optimization Ob...  2026.01  28.1       78           48.9