Agentic Coding on Multi-SWE-Bench

44.3 Pass@1

Claude Sonnet-4.5

Updated 4mo ago

Evaluation Results

Method	Links	Pass@1
Claude Sonnet-4.5 2026.03		44.3
Gemini 3-pro 2026.03		42.7
Seed1.8 2026.03		42
GPT-5 High 2026.03		41.7
Gemini 2.5-pro 2026.03		20.7