
Synthetic vs. Real Training Data for Visual Navigation

About

This paper investigates how the performance of visual navigation policies trained in simulation compares to policies trained with real-world data. Simulator-trained policies often degrade significantly when evaluated in the real world. However, despite this well-known sim-to-real gap, we demonstrate that simulator-trained policies can match the performance of their real-world-trained counterparts. Central to our approach is a navigation policy architecture that bridges the sim-to-real appearance gap by leveraging pretrained visual representations and runs in real time on robot hardware. Evaluations on a wheeled mobile robot show that the proposed policy, when trained in simulation, outperforms its real-world-trained version by 31 points and the prior state-of-the-art methods by 50 points in navigation success rate. Policy generalization is verified by deploying the same model onboard a drone. Our results highlight the importance of diverse image encoder pretraining for sim-to-real generalization, and identify on-policy learning as a key advantage of simulated training over training with real data. Code, model checkpoints, and multimedia materials are available at https://lasuomela.github.io/faint/
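The abstract describes an architecture built from a frozen pretrained visual encoder feeding a small trained policy head. The sketch below illustrates that general pattern in minimal numpy; the encoder, shapes, goal encoding, and action space are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def pretrained_encoder(image, W_enc):
    """Stand-in for a frozen, pretrained visual backbone.

    In practice this would be a pretrained network whose weights are
    kept fixed; here it is just a fixed random projection + ReLU.
    """
    return np.maximum(image.flatten() @ W_enc, 0.0)

def policy_head(features, goal, W_pi):
    """Small trainable head mapping (visual features, goal) to a
    distribution over discrete actions."""
    x = np.concatenate([features, goal])
    logits = x @ W_pi
    e = np.exp(logits - logits.max())  # numerically stable softmax
    return e / e.sum()

# Frozen encoder weights; only W_pi would be updated during
# (simulated, on-policy) training in this sketch.
W_enc = rng.standard_normal((64 * 64 * 3, 128)) * 0.01
W_pi = rng.standard_normal((128 + 2, 4)) * 0.01

image = rng.random((64, 64, 3))  # RGB observation (assumed 64x64)
goal = np.array([1.0, 0.5])      # e.g. relative goal coordinates

action_probs = policy_head(pretrained_encoder(image, W_enc), goal, W_pi)
```

Freezing the encoder is what lets the same visual features transfer between simulated and real images; only the lightweight head depends on the training domain.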

Lauri Suomela, Sasanka Kuruppu Arachchige, German F. Torres, Harry Edelman, Joni-Kristian Kämäräinen • 2025

Related benchmarks

Task                                   | Dataset | Success Rate (SR) | Rank
Object Goal Navigation                 | HM3D    | 60.3              | 55
Backward Visual Navigation (To Start)  | Gibson  | 13.2              | 48
Backward Visual Navigation (To Start)  | HM3D    | 11.3              | 48
Forward Visual Navigation (To End)     | Gibson  | 50.7              | 48
Any-Point Visual Navigation            | Gibson  | 52.0              | 24
Any-Point Visual Navigation            | HM3D    | 36.3              | 24
Forward Visual Navigation (To End)     | HM3D    | 60.3              | 24
