Cooperative Multi-agent Problem Solving on Newsgroups (Medium)
[Chart: Detected Error by method. Highlighted entry: MAS-Only (Detected Error 1), Feb 27, 2026. Selectable metrics: Detected Error, RMSE, Total Runtime, Valid Output Rate.]
Evaluation Results
Method          Configuration  Date     Detected Error  RMSE  Total Runtime  Valid Output Rate
MAS-Only        Agents=1       2026.02  1               4.08  9.69           70
MAS+LLM Judge   Agents=1       2026.02  1               4.13  10.23          100
MAS+DIG         Agents=1       2026.02  1               4.17  9.84           90
MAS-Only        Agents=6       2026.02  1.67            3.48  27.42          100
MAS-Only        Agents=3       2026.02  2               3.59  27             100
MAS+LLM Judge   Agents=3       2026.02  4               3.76  24.53          70
MAS+DIG         Agents=6       2026.02  4               3.4   29.79          100
MAS+LLM Judge   Agents=6       2026.02  5.67            4.13  30.43          100
MAS+DIG         Agents=3       2026.02  9               3.23  47.63          90