
GeoMotion: Rethinking Motion Segmentation via Latent 4D Geometry

About

Motion segmentation in dynamic scenes is highly challenging, as conventional methods rely heavily on estimating camera poses and point correspondences from inherently noisy motion cues. Existing statistical inference and iterative optimization techniques struggle to mitigate the cumulative errors of multi-stage pipelines, which often leads to limited performance or high computational cost. In contrast, we propose a fully learning-based approach that directly infers moving objects from latent feature representations via attention mechanisms, enabling end-to-end feed-forward motion segmentation. Our key insight is to bypass explicit correspondence estimation and instead let the model learn to implicitly disentangle object motion from camera motion. Building on recent advances in 4D scene geometry reconstruction (e.g., $\pi^3$), the proposed method leverages reliable camera poses and rich spatio-temporal priors, which ensure stable training and robust inference. Extensive experiments demonstrate that, by eliminating complex pre-processing and iterative refinement, our approach achieves state-of-the-art motion segmentation performance with high efficiency. The code is available at https://github.com/zjutcvg/GeoMotion.

Xiankang He, Peile Lin, Ying Cui, Dongyan Guo, Chunhua Shen, Xiaoqin Zhang • 2026

Related benchmarks

Task                                     Dataset      Result                   Rank
Unsupervised Video Object Segmentation   DAVIS 2016   Jaccard Score 84.5       32
Video Object Segmentation                FBMS         J-Score 72.5             25
Moving Object Segmentation               DAVIS M 17   Jaccard Index (J) 82.2   12
Moving Object Segmentation               YTVOS M 19   Jaccard Index (J) 65.4   8
Moving Object Segmentation               ST v2        Jaccard Index (J) 77.3   7
Moving Object Segmentation               DAVIS M 16   Jaccard Index (J) 83.5   7
Motion Segmentation                      MoCA         --                       6
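
The Jaccard figures listed above measure the overlap between a predicted mask and a ground-truth mask: J = |P ∩ G| / |P ∪ G|. The snippet below is a minimal sketch of that formula for a single pair of binary masks, not the evaluation code used for these benchmarks. The function name `jaccard_index`, the handling of two empty masks, and the reading of the listed scores as J multiplied by 100 are assumptions made for illustration.

```python
import numpy as np


def jaccard_index(pred, gt):
    """Intersection over union of two binary masks of the same shape.

    Hypothetical helper for illustration; each benchmark's official
    protocol (per-frame or per-sequence averaging, etc.) may differ.
    """
    pred = np.asarray(pred, dtype=bool)
    gt = np.asarray(gt, dtype=bool)
    if pred.shape != gt.shape:
        raise ValueError("masks must have the same shape")

    union = np.logical_or(pred, gt).sum()
    if union == 0:
        # Assumption: two empty masks are treated as a perfect match.
        return 1.0
    intersection = np.logical_and(pred, gt).sum()
    return float(intersection) / float(union)


# Toy 2x2 masks: one overlapping pixel, two pixels in the union.
pred_mask = [[1, 1], [0, 0]]
gt_mask = [[1, 0], [0, 0]]
score = jaccard_index(pred_mask, gt_mask)
print(f"J = {score:.3f} ({100 * score:.1f} as a percentage)")
```

A dataset-level score is typically an average of such per-mask values, so the exact number depends on how each benchmark aggregates frames and sequences.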
