Agentic Coding on SWE-Bench Multilingual

71.7 Accuracy

MiMo-V2-Flash

Updated 1mo ago

Evaluation Results

Method	Links	Accuracy
MiMo-V2-Flash 2026.01		71.7
DeepSeek-V3.2 Thinking 2026.01		70.2
Claude Sonnet 4.5 2026.01		68
Qwen3.6 2026.05		67.2
Kimi-K2 Thinking 2026.01		61.1
Qwen3.5 2026.05		60.3
LAGUNA XS.2 2026.05		57.7
Devstral Small 2 2026.05		55.7
GPT-5 High 2026.01		55.3
Gemma 4 2026.05		51.7
LongCat-Flash-Lite 2026.01		38.1
Kimi-Linear-48B-A3B 2026.01		37.2
Qwen3-Next-80B-A3B-Instruct 2026.01		31.3