| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| CNOT minimization | Overcooked setting circuits | Avg CNOT Count: 3.52 | 26 |
| Human-Agent Coordination | Overcooked Multi-strategy Counter (human evaluation) | Average Score: 93.09 | 9 |
| Human-Agent Coordination | Overcooked Counter Circuit (human evaluation) | Average Score: 91.11 | 9 |
| Multi-agent Coordination | Overcooked 1.0 (Extended Evaluation 10 rounds x 25 episodes) | Success Rate: 93.5 | 5 |
| Multi-agent Coordination | Overcooked Short evaluation (10 rounds x 5 episodes) 1.0 | Success Rate: 90 | 5 |
| Teammate-type classification | Overcooked Coordination Ring layout | Classification Accuracy: 96 | 5 |
| Teammate-type classification | Overcooked Asymmetric Advantage layout | Classification Accuracy: 77 | 5 |
| Teammate-type classification | Overcooked Cramped Room layout | Accuracy: 96 | 5 |
| Human-AI Collaboration | Overcooked Cramped Room | Reward: 91.33 | 4 |
| Human-AI Collaboration | Overcooked Coordination Ring | Reward: 8.4 | 4 |
| Human-AI Collaboration | Overcooked Asymmetric Advantages | Reward: 72 | 4 |
| Multi-agent task completion | Overcooked Hard novel maps (seen tasks) | Completion Rate: 56.3 | 3 |
| Multi-agent task completion | Overcooked Medium novel maps (seen tasks) | Completion Rate: 97.5 | 3 |
| Multi-agent task completion | Overcooked Easy novel maps (seen tasks) | Completion Rate: 100 | 3 |
| Multi-agent coordination | Overcooked unseen tasks, hard difficulty | Completion Rate: 43.7 | 2 |
| Multi-agent coordination | Overcooked unseen tasks, medium difficulty | Completion Rate: 1 | 2 |
| Human-AI Coordination | Overcooked Asymmetric advantages (test) | Metric: - | 0 |
| Human-AI Coordination | Overcooked Coordination ring (test) | Metric: - | 0 |
| Human-AI Coordination | Overcooked Cramped room (test) | Metric: - | 0 |
| Imitation Learning | Overcooked cramped room | Metric: - | 0 |
| Multi-agent cooperation | Overcooked | Metric: - | 0 |