RAFT-3D: Scene Flow using Rigid-Motion Embeddings

About

We address the problem of scene flow: given a pair of stereo or RGB-D video frames, estimate pixelwise 3D motion. We introduce RAFT-3D, a new deep architecture for scene flow. RAFT-3D is based on the RAFT model developed for optical flow but iteratively updates a dense field of pixelwise SE3 motion instead of 2D motion. A key innovation of RAFT-3D is rigid-motion embeddings, which represent a soft grouping of pixels into rigid objects. Integral to rigid-motion embeddings is Dense-SE3, a differentiable layer that enforces geometric consistency of the embeddings. Experiments show that RAFT-3D achieves state-of-the-art performance. On FlyingThings3D, under the two-view evaluation, we improve the best published accuracy (d < 0.05) from 34.3% to 83.7%. On KITTI, we achieve an error of 5.77, outperforming the best published method (6.31), despite using no object instance supervision. Code is available at https://github.com/princeton-vl/RAFT-3D.
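To make "a dense field of pixelwise SE3 motion" concrete, here is a minimal sketch of how one pixel's rigid motion induces a 3D flow vector, and how a thresholded accuracy such as "d < 0.05" could be computed. It is not taken from the RAFT-3D code: the function names, the pinhole intrinsics (`fx`, `fy`, `cx`, `cy`), and the reading of `d` as the per-point 3D end-point error are assumptions made for illustration. The network itself (rigid-motion embeddings, Dense-SE3) is not modelled.

```python
import math

def backproject(u, v, depth, fx, fy, cx, cy):
    """Lift pixel (u, v) with known depth to a 3D point (assumed pinhole camera)."""
    return ((u - cx) * depth / fx, (v - cy) * depth / fy, depth)

def rot_z(theta):
    """3x3 rotation about the camera z-axis; only used to build a toy SE(3) motion."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def apply_se3(R, t, X):
    """Apply the rigid motion (R, t) to the 3D point X: X' = R X + t."""
    return tuple(sum(R[i][j] * X[j] for j in range(3)) + t[i] for i in range(3))

def scene_flow(R, t, X):
    """3D flow induced at X by its own rigid motion: X' - X."""
    Xp = apply_se3(R, t, X)
    return tuple(Xp[i] - X[i] for i in range(3))

def threshold_accuracy(pred, gt, delta=0.05):
    """Fraction of points whose 3D end-point error is below delta (assumed metric)."""
    hits = sum(1 for p, g in zip(pred, gt) if math.dist(p, g) < delta)
    return hits / len(pred)

# One pixel at the principal point, 2 units deep, translating along x.
X = backproject(u=320, v=240, depth=2.0, fx=500.0, fy=500.0, cx=320.0, cy=240.0)
flow = scene_flow(rot_z(0.0), (0.1, 0.0, 0.0), X)

# Toy evaluation: the first prediction is within the threshold, the second is not.
acc = threshold_accuracy(
    pred=[(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    gt=[(0.01, 0.0, 0.0), (1.2, 0.0, 0.0)],
)
```

In a dense field every pixel carries its own `(R, t)`; pixels on the same rigid object would share the same motion, which is the grouping the rigid-motion embeddings are described as softly encoding.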

Zachary Teed, Jia Deng • 2020

Related benchmarks

Task	Dataset	Result
Optical Flow	KITTI 2015 (test)	Fl Error (All): 4.29	109
Disparity Estimation	KITTI 2015 (test)	D1 Error (bg, all): 1.48	77
Optical Flow	MPI Sintel (train)	EPE (Final): 2.91	63
Scene Flow Estimation	FlyingThings3D with occlusions (F3Do) (test)	EPE3D: 0.064	28
Scene Flow	KITTI Scene Flow 2015 (test)	D1 Score (All): 1.81	28
Scene Flow	KITTI Scene Flow (test)	D1 Error (all): 1.81	25
Optical Flow	FlyingThings3D (val)	EPE2D: 2.37	15
Scene Flow	FlyingThings3D (val)	EPE3D: 0.062	14
Scene Flow	Event-KITTI Night	EPE: 0.104	10
Scene Flow Estimation	BlinkVision (val)	Abs Rel: 0.1426	7

Showing 10 of 23 rows
