LEAP-VO: Long-term Effective Any Point Tracking for Visual Odometry
About
Visual odometry estimates the motion of a moving camera from visual input. Existing methods, which mostly focus on two-view point tracking, often ignore the rich temporal context in the image sequence; they therefore overlook global motion patterns and provide no assessment of the reliability of the full trajectory. These shortcomings hinder performance in scenarios with occlusion, dynamic objects, and low-texture areas. To address these challenges, we present the Long-term Effective Any Point Tracking (LEAP) module. LEAP combines visual, inter-track, and temporal cues with carefully selected anchors for dynamic track estimation. Moreover, LEAP's temporal probabilistic formulation integrates distribution updates into a learnable iterative refinement module to reason about point-wise uncertainty. Based on these traits, we develop LEAP-VO, a robust visual odometry system adept at handling occlusions and dynamic scenes. This integration uses long-term point tracking as the front-end of the odometry pipeline. Extensive experiments demonstrate that the proposed pipeline significantly outperforms existing baselines across various visual odometry benchmarks.
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Camera pose estimation | Sintel | ATE | 0.089 | 92 |
| Camera pose estimation | ScanNet | ATE RMSE (Avg.) | 0.07 | 61 |
| Camera pose estimation | TUM-dynamic | ATE | 0.068 | 19 |
| Camera pose estimation | Sintel 14-sequence | ATE | 8.9 | 15 |
| Visual Localization | 360SPR Pinhole (unseen) | TE (m) | 4.47 | 14 |
| Scene Pose Regression | 360SPR 1.0 (unseen) | Median Translation Error (m) | 3.89 | 13 |
| Scene Pose Regression | 360SPR 1.0 (seen) | Median Translation Error (m) | 3.77 | 13 |
| Visual Localization | 360Loc cross-validation (unseen) | Median Translation Error (m) | 2.71 | 13 |
| Visual Localization | 360Loc official (seen) | Median Translation Error (m) | 2.54 | 13 |
| Camera pose estimation | MPI Sintel | ATE (m) | 0.037 | 11 |
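Most rows above report absolute trajectory error (ATE). This page does not say how each benchmark aligns trajectories before measuring the error, so the following is only a minimal sketch of one common convention: fit a least-squares similarity transform (Umeyama alignment) from the estimated camera positions to the ground truth, then take the RMSE of the residuals. The function names `align_umeyama` and `ate_rmse` are hypothetical, and including scale in the alignment is an assumption that suits monocular odometry, where absolute scale is not observable.

```python
import numpy as np


def align_umeyama(est, gt, with_scale=True):
    """Least-squares similarity transform (s, R, t) mapping est onto gt.

    est, gt: (N, 3) arrays of camera positions at matching timestamps.
    Hypothetical helper; assumes the trajectories are already time-associated.
    """
    mu_e, mu_g = est.mean(axis=0), gt.mean(axis=0)
    e, g = est - mu_e, gt - mu_g
    cov = g.T @ e / len(est)              # 3x3 cross-covariance (gt vs. est)
    U, D, Vt = np.linalg.svd(cov)
    S = np.eye(3)
    if np.linalg.det(U) * np.linalg.det(Vt) < 0:
        S[2, 2] = -1.0                    # avoid returning a reflection
    R = U @ S @ Vt
    var_e = (e ** 2).sum() / len(est)     # variance of the estimated positions
    s = float(np.trace(np.diag(D) @ S) / var_e) if with_scale else 1.0
    t = mu_g - s * R @ mu_e
    return s, R, t


def ate_rmse(est, gt, with_scale=True):
    """RMSE of position residuals after aligning est to gt."""
    s, R, t = align_umeyama(est, gt, with_scale)
    aligned = s * (est @ R.T) + t
    return float(np.sqrt(np.mean(np.sum((aligned - gt) ** 2, axis=1))))
```

With `with_scale=False` the same sketch gives a rigid (rotation and translation only) alignment, which is the usual choice when the estimate is already metric. Numbers computed this way are only comparable to the table if the benchmark uses the same alignment and units.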