Closed-Loop Supervised Fine-Tuning of Tokenized Traffic Models
About
Traffic simulation aims to learn a policy for traffic agents that, when unrolled in closed loop, faithfully recovers the joint distribution of trajectories observed in the real world. Inspired by large language models, tokenized multi-agent policies have recently become the state of the art in traffic simulation. However, they are typically trained through open-loop behavior cloning, and thus suffer from covariate shift when executed in closed loop during simulation. In this work, we present Closest Among Top-K (CAT-K) rollouts, a simple yet effective closed-loop fine-tuning strategy to mitigate covariate shift. CAT-K fine-tuning only requires existing trajectory data, without reinforcement learning or generative adversarial imitation. Concretely, CAT-K fine-tuning enables a small 7M-parameter tokenized traffic simulation policy to outperform a 102M-parameter model from the same model family, achieving the top spot on the Waymo Sim Agent Challenge leaderboard at the time of submission. The code is available at https://github.com/NVlabs/catk.
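The abstract names the technique but does not spell out its steps. The sketch below reads the name "Closest Among Top-K" literally: at each step, take the K most likely action tokens under the policy and keep the one that lands closest to the ground-truth next state. Everything in it is an illustrative assumption rather than the released implementation: the function names `cat_k_step` and `cat_k_rollout`, the representation of tokens as 2D displacements, the Euclidean distance, and the `policy_logits_fn` callback are all hypothetical.

```python
import numpy as np


def cat_k_step(logits, token_displacements, current_pos, gt_next_pos, k=3):
    """One hypothetical CAT-K step.

    Among the k highest-scoring tokens, pick the one whose resulting
    position is closest (Euclidean distance, an assumption) to the
    ground-truth next position.
    """
    # Indices of the k most likely tokens, most likely first.
    top_k = np.argsort(logits)[::-1][:k]
    # Position reached by applying each candidate token.
    candidates = current_pos + token_displacements[top_k]
    dists = np.linalg.norm(candidates - gt_next_pos, axis=1)
    best = top_k[np.argmin(dists)]
    return int(best), current_pos + token_displacements[best]


def cat_k_rollout(policy_logits_fn, token_displacements, start_pos, gt_traj, k=3):
    """Unroll the policy in closed loop, guided by the ground truth.

    policy_logits_fn is a hypothetical callback mapping the current
    position to one logit per token in the vocabulary.
    """
    pos = np.asarray(start_pos, dtype=float)
    tokens, positions = [], []
    for gt_next in gt_traj:
        logits = policy_logits_fn(pos)
        tok, pos = cat_k_step(logits, token_displacements, pos, gt_next, k)
        tokens.append(tok)
        positions.append(pos)
    return tokens, np.stack(positions)


if __name__ == "__main__":
    # Toy vocabulary: unit moves right, up, left, down.
    displacements = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]])
    gt = np.array([[1.0, 0.0], [2.0, 0.0], [2.0, 1.0]])
    # A fixed, state-independent "policy" for demonstration only.
    fixed_logits = np.array([0.1, 2.0, 1.5, -1.0])
    tokens, positions = cat_k_rollout(lambda p: fixed_logits, displacements,
                                      [0.0, 0.0], gt, k=2)
    print(tokens)
    print(positions)
```

Under this reading, K controls the trade-off: with K = 1 the rollout follows the policy's most likely token regardless of the data, while with K equal to the vocabulary size it always picks the token nearest the ground truth. Intermediate values keep the rollout both likely under the policy and close to the recorded trajectory, which is the kind of state a closed-loop fine-tuning objective would need to visit.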
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Multi-agent trajectory simulation | Waymo Open Sim Agents Challenge (WOSAC) 2024 (test) | minADE | 1.3065 | 28 |
| Motion Simulation | Waymo Open Sim Agents Challenge 2025 | Realism Score | 78.46 | 14 |
| Traffic Simulation | WOSAC (Waymo Open Sim Agents Challenge) leaderboard latest (test) | RMM | 78.46 | 11 |
| Traffic Simulation | Waymo Open Motion Dataset (WOMD) 1.1 (test) | RMM Score | 76.87 | 10 |
| Traffic Simulation | WOSAC 2024 (val) | RMM | 77.09 | 8 |
| Interactive Trajectory Generation | Waymo Open Motion Dataset (WOMD) Interactive (test) | Collision Rate (CR) | 2.1 | 5 |
| Open-loop motion behavior modeling | WOMD (2% val) | Collision Rate | 4.38 | 5 |
| Motion Behavior Generation | WOMD Top-10% Safety-Critical | Collision Rate | 24.22 | 4 |
| Motion Behavior Modeling | WOMD Overall-3000 | Collision Rate | 0.039 | 4 |
| Simulating Agents | Waymo Sim Agents Challenge (WOSAC) 2025 (test) | Realism Score | 0.7846 | 3 |