Closed-Loop Supervised Fine-Tuning of Tokenized Traffic Models

About

Traffic simulation aims to learn a policy for traffic agents that, when unrolled in closed-loop, faithfully recovers the joint distribution of trajectories observed in the real world. Inspired by large language models, tokenized multi-agent policies have recently become the state-of-the-art in traffic simulation. However, they are typically trained through open-loop behavior cloning, and thus suffer from covariate shift when executed in closed-loop during simulation. In this work, we present Closest Among Top-K (CAT-K) rollouts, a simple yet effective closed-loop fine-tuning strategy to mitigate covariate shift. CAT-K fine-tuning only requires existing trajectory data, without reinforcement learning or generative adversarial imitation. Concretely, CAT-K fine-tuning enables a small 7M-parameter tokenized traffic simulation policy to outperform a 102M-parameter model from the same model family, achieving the top spot on the Waymo Sim Agent Challenge leaderboard at the time of submission. The code is available at https://github.com/NVlabs/catk.
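The abstract names the technique but does not spell out its steps, so the following is only a minimal sketch of one plausible reading of "Closest Among Top-K": at each rollout step, restrict attention to the K most likely action tokens under the policy, then pick the one whose resulting position is closest to the ground-truth trajectory. Everything here is an assumption made for illustration (the names `closest_among_top_k`, `catk_rollout`, `policy_fn`, and `token_offsets`, the representation of tokens as 2D displacements, and the Euclidean distance); it is not the implementation in the NVlabs/catk repository.

```python
import numpy as np

def closest_among_top_k(probs, token_offsets, current_pos, gt_next_pos, k):
    """Pick, among the k most likely tokens, the one landing closest to ground truth.

    probs:         (V,) policy probabilities over a hypothetical token vocabulary
    token_offsets: (V, 2) displacement each token represents (an assumption)
    """
    top_k = np.argsort(probs)[-k:]                    # indices of the k most likely tokens
    candidates = current_pos + token_offsets[top_k]   # positions reached by each candidate
    dists = np.linalg.norm(candidates - gt_next_pos, axis=1)
    return int(top_k[np.argmin(dists)])

def catk_rollout(policy_fn, token_offsets, start_pos, gt_positions, k):
    """Unroll the policy in closed loop, steering each step toward the ground truth."""
    pos = np.asarray(start_pos, dtype=float)
    tokens, visited = [], []
    for gt_next in gt_positions:
        probs = policy_fn(pos)                        # policy sees its own rolled-out state
        tok = closest_among_top_k(probs, token_offsets, pos, gt_next, k)
        pos = pos + token_offsets[tok]
        tokens.append(tok)
        visited.append(pos.copy())
    return tokens, np.array(visited)
```

In this sketch, a small K keeps the rollout close to what the policy itself would do, while a large K keeps it close to the logged trajectory; with K equal to the vocabulary size the selection reduces to picking the nearest token regardless of the policy.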

Zhejun Zhang, Peter Karkus, Maximilian Igl, Wenhao Ding, Yuxiao Chen, Boris Ivanovic, Marco Pavone • 2024

Related benchmarks

Task	Dataset	Result
Multi-agent trajectory simulation	Waymo Open Sim Agents Challenge (WOSAC) 2024 (test)	minADE 1.3065	28
Traffic Simulation	Waymo Open Motion Dataset (WOMD) v2025 (private test)	RMM Score 78.56	14
Motion Simulation	Waymo Open Sim Agents Challenge 2025	Realism Score 78.46	14
Traffic Simulation	WOSAC (Waymo Open Sim Agents Challenge) leaderboard latest (test)	RMM 78.46	11
Traffic Simulation	Waymo Open Motion Dataset (WOMD) 1.1 (test)	RMM Score 76.87	10
Traffic Simulation	WOSAC 2024 (val)	RMM 77.09	8
Interactive Trajectory Generation	Waymo Open Motion Dataset (WOMD) Interactive (test)	Collision Rate (CR) 2.1	5
Open-loop motion behavior modeling	WOMD (2% val)	Collision Rate 4.38	5
Motion Behavior Generation	WOMD Top-10% Safety-Critical	Collision Rate 24.22	4
Motion Behavior Modeling	WOMD Overall-3000	Collision Rate 0.039	4

