Cooperative Cooking on Overcooked-AI Cramped Room
[Chart: J-Score by date. Highlighted point: J-Score 188, Diagnostic-Grounded Search. Date shown: Mar 25, 2026. Selectable metrics: J-Score, Successful Deliveries, Invalid Deliveries.]
Updated 24d ago
Evaluation Results
Method                      Generation  Date     J-Score  Successful Deliveries  Invalid Deliveries
Diagnostic-Grounded Search  Gen 2       2026.03  188      9.4                    0.75
Diagnostic-Grounded Search  Gen 1       2026.03  180      9                      0.25
MAPPO                       Baseline    2026.03  148      7.4                    1.15