Program Synthesis on HumanEval-EvalPlus Standard (test)
Top result: 89.6 pass@1 (QualityFlow)
[Chart: pass@1 and Delta (Δ↑) by date; data point at Jan 20, 2025]
Evaluation Results
Method                                   Date     pass@1  Delta (Δ↑)
QualityFlow (LLM Backbone=Claude So...)  2025.01  89.6    3
Instruct-Turbo                           2025.01  86.6    -