Bootstrap Dynamic-Aware 3D Visual Representation for Scalable Robot Learning

About

Despite strong results on recognition and segmentation, current 3D visual pre-training methods often underperform on robotic manipulation. We attribute this gap to two factors: the lack of state-action-state dynamics modeling and the unnecessary redundancy of explicit geometric reconstruction. We introduce AFRO, a self-supervised framework that learns dynamics-aware 3D representations without action or reconstruction supervision. AFRO casts state prediction as a generative diffusion process and jointly models forward and inverse dynamics in a shared latent space to capture causal transition structure. To prevent feature leakage in action learning, we employ feature differencing and inverse-consistency supervision, improving the quality and stability of visual features. When combined with Diffusion Policy, AFRO substantially increases manipulation success rates across 16 simulated and 4 real-world tasks, outperforming existing pre-training approaches. The framework also scales favorably with data volume and task complexity. Qualitative visualizations indicate that AFRO learns semantically rich, discriminative features, offering an effective pre-training solution for 3D representation learning in robotics. Project page: https://kolakivy.github.io/AFRO/
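The joint forward/inverse dynamics idea described above can be illustrated with a minimal NumPy sketch. This is not the AFRO implementation: the weight matrices, dimensions, and function names are hypothetical, the diffusion-based state predictor is swapped for a plain deterministic regression, and the inverse-consistency term shown here is one assumed reading of the abstract (re-inferring the latent action from the predicted transition).

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes; the listing does not specify any dimensions.
FEAT_DIM, ACT_DIM = 32, 8

# Random matrices stand in for learned networks (assumption, for illustration only).
W_inv = rng.normal(scale=0.1, size=(FEAT_DIM, ACT_DIM))             # inverse dynamics
W_fwd = rng.normal(scale=0.1, size=(FEAT_DIM + ACT_DIM, FEAT_DIM))  # forward dynamics


def inverse_dynamics(z_t, z_next):
    """Infer a latent action from the feature difference only.

    Feeding z_next - z_t (rather than z_next itself) is the feature-differencing
    idea: the latent action cannot simply copy the next-state features.
    """
    return np.tanh((z_next - z_t) @ W_inv)


def forward_dynamics(z_t, a_latent):
    """Predict next-state features from the current features and latent action.

    Deterministic stand-in; the abstract describes a generative diffusion process.
    """
    return np.concatenate([z_t, a_latent], axis=-1) @ W_fwd


def dynamics_losses(z_t, z_next):
    """Return (forward_loss, consistency_loss) for a batch of feature pairs."""
    a_latent = inverse_dynamics(z_t, z_next)
    z_pred = forward_dynamics(z_t, a_latent)
    forward_loss = np.mean((z_pred - z_next) ** 2)

    # Assumed form of inverse consistency: the latent action re-inferred from
    # the predicted transition should match the one inferred from the real one.
    a_cycle = inverse_dynamics(z_t, z_pred)
    consistency_loss = np.mean((a_cycle - a_latent) ** 2)
    return forward_loss, consistency_loss
```

Calling `dynamics_losses(z_t, z_next)` on two feature batches of shape `(batch, FEAT_DIM)` returns two scalar losses that a training loop would sum and minimize. Note that no ground-truth actions appear anywhere, which matches the abstract's claim of learning without action supervision.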

Qiwei Liang, Boyang Cai, Minghao Lai, Sitong Zhuang, Tao Lin, Yan Qin, Yixuan Ye, Jiaming Liang, Renjing Xu • 2025

Related benchmarks

Task	Dataset	Metric	Result
Robot Manipulation	Adroit	Pen Task Score	84	50
Robot Manipulation	MetaWorld	Success Rate (Easy)	88	17
Robot Manipulation Aggregate	Franka Manipulation Real-World (Evaluation)	Mean Success Rate	84	16
Bell Pressing	Franka Real-World Manipulation (Evaluation)	Success Rate	90	9
Block-to-Block Alignment	Franka Manipulation Real-World (Evaluation)	Success Rate	85	9
Cover Block	Franka Real-World Manipulation (Evaluation)	Success Rate	85	9
Fruit Pick-and-Place	Franka Manipulation Real-World (Evaluation)	Success Rate	75	9
Robot Manipulation	Meta-World-5 25-demonstration budget v1 (test)	Bin Picking Success Rate	20	5

