Codebase QA on SWE-Atlas
Score: 32.25 (May 8, 2026)
Updated 15d ago
Evaluation Results
Method            Backbone LLM    Date      Score
(name missing)    GPT-5.2         2026.05   32.25
Claude Code*      Claude Op...    2026.05   31.2
Codex             GPT-5.2         2026.05   29.83
mini-swe-agent    GPT-5.2         2026.05   25.81