APPS

Benchmarks

Task Name                  Dataset Name           Metric                 SOTA Result  Trend
Code Generation            APPS                   Pass@1                 91.2         69
Code Generation            APPS (test)            Introductory Score     56.3         36
Code Generation            APPS Intermediate      Pass Rate              81.95        32
Code Safety Evaluation     APPS 1.0 (test)        Safety Score           0.988        30
Code Generation            APPS Introductory      PR                     85.18        21
Code Generation            APPS Competition       Accuracy               69.66        20
Code Generation            APPS Overall           PR                     21.38        18
Code Generation            APPS                   Precision Rate         60.33        12
Program Synthesis          APPS 1.0 (test)        Pass@5 (Introductory)  25.61        11
Code metric regression     APPS Leetcode (test)   RMSE                   0.474        6
Coding Reasoning           Apps                   Pass Rate              68.3         5
Program Synthesis          APPS                   Pass@5 (Introductory)  25.61        5
Code Generation            APPS Interview         Pass@1                 2.64         5
Code Generation            APPS                   Avg@8                  33.7         4
Code Generation Oversight  APPS                   Safety Score           63           4
Program Repair             APPS (test)            Strict Accuracy        21.7         4
Program Discrimination     APPS (test)            Accuracy               42.9         4
Code Generation            APPS stdin-style Plus  Syntax Validity        83.4         3