TartanVO: A Generalizable Learning-based VO

About

We present the first learning-based visual odometry (VO) model that generalizes to multiple datasets and real-world scenarios and outperforms geometry-based methods in challenging scenes. We achieve this by leveraging the SLAM dataset TartanAir, which provides a large amount of diverse synthetic data in challenging environments. Furthermore, to make our VO model generalize across datasets, we propose an up-to-scale loss function and incorporate the camera intrinsic parameters into the model. Experiments show that a single model, TartanVO, trained only on synthetic data and without any finetuning, generalizes to real-world datasets such as KITTI and EuRoC, demonstrating significant advantages over geometry-based methods on challenging trajectories. Our code is available at https://github.com/castacks/tartanvo.
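The abstract names two ingredients, an up-to-scale loss and the use of camera intrinsics as model input, without defining either. The sketch below illustrates one plausible reading of each, in NumPy rather than the released code: the loss is assumed to compare unit-normalized translation vectors, and the intrinsics are assumed to enter as a two-channel map of normalized pixel coordinates. The function names `up_to_scale_loss` and `intrinsics_layer`, the `eps` guard, and both formulas are assumptions made for illustration and are not taken from the text above.

```python
import numpy as np

def up_to_scale_loss(t_pred, t_true, eps=1e-6):
    """Distance between unit-normalized translations (assumed form)."""
    t_pred = np.asarray(t_pred, dtype=float)
    t_true = np.asarray(t_true, dtype=float)
    # Normalize both vectors so only the direction of motion is compared;
    # eps guards against division by zero for near-stationary frames.
    n_pred = t_pred / max(np.linalg.norm(t_pred), eps)
    n_true = t_true / max(np.linalg.norm(t_true), eps)
    return float(np.linalg.norm(n_pred - n_true))

def intrinsics_layer(height, width, fx, fy, cx, cy):
    """Two-channel map of normalized pixel coordinates (hypothetical helper)."""
    # ys holds the row index and xs the column index of every pixel.
    ys, xs = np.mgrid[0:height, 0:width].astype(float)
    # Channel 0: (x - cx) / fx, channel 1: (y - cy) / fy.
    return np.stack([(xs - cx) / fx, (ys - cy) / fy], axis=0)
```

Under these assumptions, two translations that differ only in magnitude, such as `[1, 0, 0]` and `[5, 0, 0]`, give a loss of zero, which is the property a monocular model needs because absolute scale cannot be observed from a single camera. The intrinsics map has shape `(2, height, width)` and could be concatenated with the image-derived input so that the same network can be applied to cameras with different focal lengths and principal points.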

Wenshan Wang, Yaoyu Hu, Sebastian Scherer • 2020

Related benchmarks

Task	Dataset	Result
Visual-Inertial Odometry	EuRoC (All sequences)	MH1 Error: 0.639	62
Visual Odometry	KITTI	KITTI Seq 03 Error: 2.7	45
Visual Odometry	TUM-RGBD	freiburg1/desk2 Error: 0.122	43
Camera pose estimation	TUM freiburg1	Rotation Error: 0.049	34
Camera pose estimation	Sintel 14-sequence	ATE: 23.8	24
Visual Odometry	TartanAir (test)	Error MH000: 2.12	19
Visual-Inertial Odometry	EuRoC MAV	Average Error: 0.68	14
Tracking	EuRoC Dataset	MH 01 Score: 63.9	13
Camera pose estimation	MPI Sintel	ATE (m): 0.238	13
Monocular SLAM	EuRoC (test)	ATE Error (MH03): 0.55	12

Showing 10 of 18 rows
