Cooperative Cooking on Overcooked-AI Coordination Ring
[Chart: J-Score. Diagnostic-Grounded Search: 153 (Mar 25, 2026). Metrics: J-Score, Successful Deliveries, Invalid Deliveries.]
Updated 24d ago
Evaluation Results
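| Method                     | Generation | Date    | J-Score | Successful Deliveries | Invalid Deliveries |
|----------------------------|------------|---------|---------|-----------------------|--------------------|
| Diagnostic-Grounded Search | Gen 2      | 2026.03 | 153     | 7.65                  | 0.85               |
| Diagnostic-Grounded Search | Gen 1      | 2026.03 | 102     | 5.1                   | 3.5                |
| MAPPO                      | Baseline   | 2026.03 | 13      | 0.15                  | 11.3               |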