Cooperative Multi-agent Problem Solving on Newsgroups (Easy)
[Chart: Detected Error by method. Selectable metrics: Detected Error, RMSE, Total Runtime, Valid Output Rate. Entry shown: MAS-Only, Feb 27, 2026.]
Evaluation Results
| Method | Agents | Date | Detected Error | RMSE | Total Runtime | Valid Output Rate |
|---|---|---|---|---|---|---|
| MAS-Only | 1 | 2026.02 | 1 | 2.4 | 11.47 | 100 |
| MAS+LLM Judge | 1 | 2026.02 | 1 | 2.32 | 12.57 | 100 |
| MAS+DIG | 1 | 2026.02 | 1 | 2.38 | 9.83 | 70 |
| MAS-Only | 3 | 2026.02 | 2 | 2.11 | 21.98 | 100 |
| MAS-Only | 6 | 2026.02 | 2 | 2.18 | 21.33 | 100 |
| MAS+LLM Judge | 3 | 2026.02 | 4 | 2.19 | 28.83 | 70 |
| MAS+LLM Judge | 6 | 2026.02 | 4 | 2.35 | 26.73 | 100 |
| MAS+DIG | 6 | 2026.02 | 8 | 2.01 | 48.12 | 100 |
| MAS+DIG | 3 | 2026.02 | 9 | 1.82 | 37.69 | 100 |
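The results listed above can be compared programmatically. The following is a minimal sketch, not part of the benchmark itself: the `Result` type, its field names, and the `rank_by` helper are hypothetical names introduced for illustration, and the values are copied from the listing. The page does not state whether higher or lower is better for each metric, so the sketch only reports extremes and makes no claim about which entry is "best".

```python
from typing import NamedTuple


class Result(NamedTuple):
    # Hypothetical record type; field names mirror the column labels on the page.
    method: str
    agents: int
    detected_error: int
    rmse: float
    total_runtime: float
    valid_output_rate: int


# Rows copied from the Evaluation Results listing (all dated 2026.02).
RESULTS = [
    Result("MAS-Only", 1, 1, 2.40, 11.47, 100),
    Result("MAS+LLM Judge", 1, 1, 2.32, 12.57, 100),
    Result("MAS+DIG", 1, 1, 2.38, 9.83, 70),
    Result("MAS-Only", 3, 2, 2.11, 21.98, 100),
    Result("MAS-Only", 6, 2, 2.18, 21.33, 100),
    Result("MAS+LLM Judge", 3, 4, 2.19, 28.83, 70),
    Result("MAS+LLM Judge", 6, 4, 2.35, 26.73, 100),
    Result("MAS+DIG", 6, 8, 2.01, 48.12, 100),
    Result("MAS+DIG", 3, 9, 1.82, 37.69, 100),
]


def rank_by(results, field, descending=False):
    """Return the results sorted by the named field (ascending by default)."""
    return sorted(results, key=lambda r: getattr(r, field), reverse=descending)


# Extremes per metric; interpretation (better or worse) is left to the reader.
lowest_rmse = rank_by(RESULTS, "rmse")[0]
highest_detected = rank_by(RESULTS, "detected_error", descending=True)[0]
shortest_runtime = rank_by(RESULTS, "total_runtime")[0]

for label, row in [
    ("Lowest RMSE", lowest_rmse),
    ("Highest Detected Error", highest_detected),
    ("Shortest Total Runtime", shortest_runtime),
]:
    print(f"{label}: {row.method} (Agents={row.agents})")
```

Keeping each row as a named tuple makes it straightforward to sort on any column or to filter, for example to entries whose Valid Output Rate is 100.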