Dafny Code Synthesis on APPS Vericoding-derived (test)
[Chart: Pass Rate; labeled date May 29, 2026. Top result: 83 (Model union).]
Evaluation Results
Method             Settings              Links     Pass Rate
Model union        Attempts per task=5   2026.05   83
GPT-5 mini         Attempts per task=5   2026.05   71.6
Claude Opus 4.1    Attempts per task=5   2026.05   66.6
Gemini 2.5 Flash   Attempts per task=5   2026.05   36.9