Multi-Agent Collaboration Role Overstepping on SWE easy (dev)
[Chart: Overstepping Rate (<INFO>) for ChatDev, with a data point of 0.4 dated Apr 3, 2026. Available metrics: Overstepping Rate (<INFO>), Overstepping Rate (INFO), Delta (%) (<INFO>), Delta (%) (INFO).]
Updated 13d ago
Evaluation Results
Method   Configuration              Date     Overstepping Rate (<INFO>)  Overstepping Rate (INFO)  Delta (%) (<INFO>)  Delta (%) (INFO)
ChatDev  CEO Configuration=FT,...   2026.04  0.4                         0.4                       -44.8               -46
ChatDev  CEO Configuration=FT,...   2026.04  0.8                         0.8                       -44.4               -45.6
ChatDev  CEO Configuration=Base...  2026.04  16.8                        16.8                      -28.4               -29.6
ChatDev  CEO Configuration=Base...  2026.04  45.2                        46.4                      -                   -
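The Delta (%) columns are not defined on the page, but the listed values are consistent with a simple difference, in percentage points, between each row's Overstepping Rate and that of the last row (the one with no Delta). The sketch below checks that reading against the table. The row labels and the `deltas` helper are hypothetical names used only for illustration, and treating the last row as the baseline is an assumption.

```python
# Overstepping rates copied from the table, in row order:
# (label, Overstepping Rate (<INFO>), Overstepping Rate (INFO)).
# The labels are hypothetical; the page truncates the configuration names.
rows = [
    ("row 1", 0.4, 0.4),
    ("row 2", 0.8, 0.8),
    ("row 3", 16.8, 16.8),
    ("row 4", 45.2, 46.4),  # assumed baseline: the row shown with "-" deltas
]


def deltas(table, baseline_index=-1):
    """Return each non-baseline row's difference from the baseline row.

    Differences are in percentage points, rounded to one decimal place
    to match the precision shown in the table.
    """
    baseline_pos = baseline_index % len(table)
    _, base_lt, base_info = table[baseline_pos]
    result = []
    for pos, (label, rate_lt, rate_info) in enumerate(table):
        if pos == baseline_pos:
            continue  # the baseline has no delta against itself
        result.append(
            (label, round(rate_lt - base_lt, 1), round(rate_info - base_info, 1))
        )
    return result


for label, delta_lt, delta_info in deltas(rows):
    print(f"{label}: Delta (<INFO>) = {delta_lt}, Delta (INFO) = {delta_info}")
```

Under this assumption the computed differences match all six Delta values listed in the table.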