Repository Understanding on SWE-bench Verified (test)
[Chart: Steps over time. Highlighted point: RPG-Encoder, 6.34 Steps, Feb 2, 2026. Available metrics: Steps, Cost ($), Effectiveness Score.]
Updated 4d ago
Evaluation Results
Method       Backbone  Date     Steps  Cost ($)  Effectiveness Score
RPG-Encoder  GPT-5     2026.02   6.34  0.22      4.15
LocAgent     GPT-5     2026.02   6.48  0.49      1.64
RPG-Encoder  GPT-4.1   2026.02   6.75  0.18      4.63
LocAgent     GPT-4.1   2026.02  11.94  0.86      0.76
CoSIL        GPT-5     2026.02  19.52  0.31      2.64
CoSIL        GPT-4.1   2026.02  19.77  0.24      3.1
OrcaLoca     GPT-4.1   2026.02  20.22  0.46      1.48
OrcaLoca     GPT-5     2026.02  36.93  0.75      1.16