
Planning in 8 Tokens: A Compact Discrete Tokenizer for Latent World Model

About

World models provide a powerful framework for simulating environment dynamics conditioned on actions or instructions, enabling downstream tasks such as action planning or policy learning. Recent approaches leverage world models as learned simulators, but their application to decision-time planning remains computationally prohibitive for real-time control. A key bottleneck lies in latent representations: conventional tokenizers encode each observation into hundreds of tokens, making planning both slow and resource-intensive. To address this, we propose CompACT, a discrete tokenizer that compresses each observation into as few as 8 tokens, drastically reducing computational cost while preserving essential information for planning. An action-conditioned world model that uses the CompACT tokenizer achieves competitive planning performance with orders-of-magnitude faster planning, offering a practical step toward real-world deployment of world models.

Dongwon Kim, Gawon Seo, Jinsung Lee, Minsu Cho, Suha Kwak • 2026
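
To make the abstract's cost argument concrete, here is a rough back-of-the-envelope sketch. It is not taken from the paper: it assumes a world model that applies full self-attention over all tokens in a rollout, a hypothetical baseline of 256 tokens per observation (standing in for "hundreds"), and a hypothetical context of 8 frames. The function name `attention_pairs` is illustrative only.

```python
def attention_pairs(tokens_per_frame: int, frames: int) -> int:
    """Count query-key pairs for full self-attention over a rollout.

    Assumption: every token attends to every other token, so the cost
    grows with the square of the total sequence length.
    """
    seq_len = tokens_per_frame * frames
    return seq_len * seq_len


FRAMES = 8                # hypothetical context length, not from the paper
BASELINE_TOKENS = 256     # hypothetical "hundreds of tokens" tokenizer
COMPACT_TOKENS = 8        # token count stated in the abstract

baseline = attention_pairs(BASELINE_TOKENS, FRAMES)
compact = attention_pairs(COMPACT_TOKENS, FRAMES)
ratio = baseline // compact

print(f"baseline pairs: {baseline}")
print(f"compact pairs:  {compact}")
print(f"reduction:      {ratio}x")
```

Under these assumptions the attention cost per forward pass drops by roughly three orders of magnitude. Actual speedups depend on the model architecture and the planner, which this sketch does not model.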

Related benchmarks

Task                                  Dataset           Metric         Result   Rank
Image Reconstruction                  ImageNet (val)    rFID           2.4      95
Goal Conditioned Visual Navigation    SCAND             ATE            1.358    18
Goal Conditioned Visual Navigation    RECON             ATE            1.33     16
Robotic Manipulation                  Robomimic Lift    Success Rate   56       14
Action-conditioned Video Prediction   RoboNet           APE            0.1122   2
Inverse Dynamics Modeling             RoboNet           L1 Error       0.091    2
