Coding on SWE-bench Lite (test)

25.83 Accuracy

MAS-ZERO

Updated 4mo ago

Evaluation Results

Method	Links	Accuracy
MAS-ZERO 2025.05		25.83
MAS-ZERO 2025.05		16.74
AFlow 2025.05		16.25
Debate 2025.05		12.5
Self-Refine 2025.05		11.67
MaAS 2025.05		10
CoT 2025.05		9.17
Debate 2025.05		6.67
AFlow 2025.05		6.67
MaAS 2025.05		5
CoT 2025.05		2.92
Self-Refine 2025.05		1.67