Code Repair on SWE-bench Lite

0.77 r

ADARUBRIC-DA

Updated 4mo ago

Evaluation Results

Method	Links	r	(unlabeled)	(unlabeled)
ADARUBRIC-DA 2026.03		0.77	0.84	14.7
ADARUBRIC-WM 2026.03		0.72	0.82	12.4
GPT-4 Direct 2026.03		0.59	0.68	9.8
Prometheus 2026.03		0.56	0.70	9.1
G-Eval (GPT-4o) 2026.03		0.51	0.63	8.2