Automated Program Repair on Defects4J v2.0 (835 bugs)
[Chart: Pass@1 by date. Top entry: BOOSTAPR (+ Rline), 24.8 Pass@1, May 9, 2026.]
Evaluation Results
Method                        | Details                   | Date    | Pass@1
------------------------------|---------------------------|---------|-------
BOOSTAPR (+ Rline)            | Params=32B                | 2026.05 | 24.8
+ Stage III (PPO, Rseq only)  | Params=32B                | 2026.05 | 19.2
RepairLLaMA                   | Backbone=CodeLlama-7B,... | 2026.05 | 17.2
SWE-RL                        | Backbone=Llama-3-70B,...  | 2026.05 | 16.8
SWE-Fixer                     | Backbone=Qwen2.5-72B,...  | 2026.05 | 15.2
+ Stage I (SFT)               | Params=32B                | 2026.05 | 14.9
Lingma SWE-GPT                | Backbone=Qwen2.5-72B,...  | 2026.05 | 14.5
ChatRepair                    | Backbone=GPT-3.5-turbo    | 2026.05 | 14.4
SWE-Gym                       | Backbone=Qwen2.5-Coder... | 2026.05 | 13.1
Agentless                     | Backbone=GPT-4o           | 2026.05 | 12.4
Qwen2.5-Coder-32B (base)      | Params=32B                | 2026.05 | 11.3
SWE-agent                     | Backbone=Claude 3.5 So... | 2026.05 | 10.8
AutoCodeRover                 | Backbone=GPT-4o           | 2026.05 | 9.6
RLEF                          | Backbone=Llama-3-8B, P... | 2026.05 | 8.4
KNOD                          | Backbone=CodeT5-base,...  | 2026.05 | 6.0
CodeRL                        | Backbone=CodeT5-large,... | 2026.05 | 5.8
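
The page does not define how the Pass@1 values above are computed. A minimal sketch, assuming Pass@1 is the percentage of benchmark bugs whose first generated patch passes the test suite; the function name `pass_at_1` and the sample outcomes are hypothetical and are not taken from any listed method:

```python
from typing import Sequence


def pass_at_1(first_attempt_passed: Sequence[bool]) -> float:
    """Percentage of bugs whose first generated patch passed the tests.

    Assumption: one boolean per bug, True if the first patch was accepted.
    """
    if not first_attempt_passed:
        raise ValueError("need at least one bug outcome")
    return 100.0 * sum(first_attempt_passed) / len(first_attempt_passed)


# Hypothetical outcomes for 8 bugs (illustrative only, not benchmark data).
outcomes = [True, False, False, True, False, False, False, False]
score = pass_at_1(outcomes)
print(f"Pass@1 = {score:.1f}")
```

Under this reading, scores are only comparable across rows when every method is evaluated on the same set of bugs.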