Cooperative Cooking on Overcooked-AI Cramped Room
[Chart: Total Mean Reward by method, as of May 23, 2026; top entry PASD at 150. Selectable metrics: Total Mean Reward, J-Score, Successful Deliveries, Invalid Deliveries.]
Updated 8d ago
Evaluation Results
Method                      Configuration              Date     Total Mean Reward  J-Score  Successful Deliveries  Invalid Deliveries
--------------------------  -------------------------  -------  -----------------  -------  ---------------------  ------------------
PASD                        Partner Type=Behaviour...  2026.05  150                -        -                      -
FCP                         Partner Type=Behaviour...  2026.05  118.75             -        -                      -
HiPT                        Partner Type=Behaviour...  2026.05  93.13              -        -                      -
DIAYN                       Partner Type=Behaviour...  2026.05  40                 -        -                      -
MAPPO                       Generation=Baseline        2026.03  -                  148      7.4                    1.15
Diagnostic-Grounded Search  Generation=Gen 1           2026.03  -                  180      9                      0.25
Diagnostic-Grounded Search  Generation=Gen 2           2026.03  -                  188      9.4                    0.75