Codebase QA on SWE-QA (test)

Score: 80.28

GPT-4.1-mini

Jan 29, 2026
Updated 1mo ago

Evaluation Results

Date     Score
2026.01  80.28
2026.01  78.05
2026.01  73.36
2026.01  73.09
2026.01  65.48
2026.01  65.19
2026.01  61.79
2026.01  57.3
2026.01  35.64