NeRF-VO: Real-Time Sparse Visual Odometry with Neural Radiance Fields

About

We introduce a novel monocular visual odometry (VO) system, NeRF-VO, that integrates learning-based sparse visual odometry for low-latency camera tracking and a neural radiance scene representation for fine-detailed dense reconstruction and novel view synthesis. Our system initializes camera poses using sparse visual odometry and obtains view-dependent dense geometry priors from a monocular prediction network. We harmonize the scale of poses and dense geometry, treating them as supervisory cues to train a neural implicit scene representation. NeRF-VO demonstrates exceptional performance in both photometric and geometric fidelity of the scene representation by jointly optimizing a sliding window of keyframed poses and the underlying dense geometry, which is accomplished through training the radiance field with volume rendering. We surpass state-of-the-art methods in pose estimation accuracy, novel view synthesis fidelity, and dense reconstruction quality across a variety of synthetic and real-world datasets while achieving a higher camera tracking frequency and consuming less GPU memory.
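The abstract mentions harmonizing the scale of the sparse VO poses with the monocular dense geometry priors, since monocular depth predictions are only defined up to scale. The paper's exact alignment procedure is not given on this page; the sketch below shows one common approach, a closed-form least-squares scale fit of the predicted depth against the (metrically consistent) sparse VO depths. The function name `align_scale` and the single-scalar model are illustrative assumptions.

```python
import numpy as np

def align_scale(mono_depth: np.ndarray, sparse_depth: np.ndarray) -> float:
    """Closed-form least-squares scale aligning a monocular depth map
    to sparse VO depths: minimizes || s * mono_depth - sparse_depth ||^2
    over pixels where the sparse system produced a depth.

    Illustrative sketch; not the paper's exact formulation.
    """
    mask = sparse_depth > 0  # sparse VO triangulates only a few pixels
    m, t = mono_depth[mask], sparse_depth[mask]
    # d/ds ||s*m - t||^2 = 0  =>  s = <m, t> / <m, m>
    return float(np.dot(m, t) / np.dot(m, m))

# Toy check: sparse depths are exactly 2x the monocular prediction,
# with zeros marking pixels the sparse system did not observe.
mono = np.array([1.0, 2.0, 4.0, 3.0])
sparse = np.array([2.0, 4.0, 8.0, 0.0])
s = align_scale(mono, sparse)  # → 2.0
```

Once a scale (and, in more robust variants, a shift) is fitted, both the poses and the dense priors can supervise the radiance field in a consistent metric frame.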

Jens Naumann, Binbin Xu, Stefan Leutenegger, Xingxing Zuo • 2023

Related benchmarks

Task                       Dataset                                Metric               Result  Rank
Monocular Visual Odometry  VIVID (mean over sequences)            ATE RMSE             0.76    20
Monocular Visual Odometry  VIVID in_rob_local                     ATE RMSE             0.05    18
Monocular Visual Odometry  VIVID in_rob_global                    ATE RMSE             0.08    17
Monocular Visual Odometry  VIVID in_unst_local                    ATE RMSE             0.04    17
Monocular Visual Odometry  VIVID in_rob_dark                      ATE RMSE             0.05    16
Monocular Visual Odometry  VIVID in_unst_global                   ATE RMSE             0.12    15
Monocular Visual Odometry  VIVID in_agg_global                    ATE RMSE             0.16    14
Monocular Visual Odometry  VIVID in_unst_dark                     ATE RMSE             0.09    13
Monocular Visual Odometry  VIVID in_agg_dark                      ATE RMSE             0.1     12
Visual Localization        Chang'e-3 Real Flight Dataset (test)   Translational Error  12.9    11
Showing 10 of 19 rows
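The ATE RMSE metric reported above is the root-mean-square of the Absolute Trajectory Error: after aligning the estimated trajectory to the ground truth, it is the RMS of the per-frame position errors. A minimal sketch of the metric itself (the alignment step is assumed to have been done already; the function name `ate_rmse` is illustrative):

```python
import numpy as np

def ate_rmse(est: np.ndarray, gt: np.ndarray) -> float:
    """ATE RMSE between aligned estimated and ground-truth camera
    positions, each of shape (N, 3):
        sqrt( mean_i || est_i - gt_i ||^2 )
    """
    err = est - gt
    return float(np.sqrt(np.mean(np.sum(err ** 2, axis=1))))

# Toy check: a constant 0.1 m offset along x gives ATE RMSE = 0.1.
gt = np.zeros((4, 3))
est = gt + np.array([0.1, 0.0, 0.0])
e = ate_rmse(est, gt)  # → 0.1
```

For monocular systems the alignment is typically a similarity transform (rotation, translation, and scale), since scale is unobservable from a single camera.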
