Software Engineering on SWE Verified MEDIUM reasoning
Overall Score: 53.3 (HarmonyAgent)
[Chart: Overall Score and 95% CI Lower Bound for HarmonyAgent, Apr 1, 2026]
Updated 17d ago
Evaluation Results

Method        Details                    Date     Overall Score  95% CI Lower Bound
HarmonyAgent  Evaluation Harness=Har...  2026.04  53.3           49.3
gpt-oss-20b   Evaluation Harness=Ope...  2026.04  53.2           -