A Deep Temporal Fusion Framework for Scene Flow Using a Learnable Motion Model and Occlusions

About

Motion estimation is one of the core challenges in computer vision. With traditional dual-frame approaches, occlusions and out-of-view motions are a limiting factor, especially in the context of environmental perception for vehicles due to the large (ego-) motion of objects. Our work proposes a novel data-driven approach for temporal fusion of scene flow estimates in a multi-frame setup to overcome the issue of occlusion. Contrary to most previous methods, we do not rely on a constant motion model, but instead learn a generic temporal relation of motion from data. In a second step, a neural network combines bi-directional scene flow estimates from a common reference frame, yielding a refined estimate and a natural byproduct of occlusion masks. This way, our approach provides a fast multi-frame extension for a variety of scene flow estimators, which outperforms the underlying dual-frame approaches.
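To make the fusion idea concrete, here is a minimal NumPy sketch of blending two estimates of the same forward motion for a common reference frame. It is not the authors' implementation: the function names, array layout, and hand-set weights are all assumptions made for illustration. In the paper, the mapping from backward to forward motion and the fusion weights are both learned; the sketch substitutes the constant-motion inverse (the baseline the abstract contrasts against) and a fixed per-pixel weight map.

```python
import numpy as np


def constant_motion_inverse(backward):
    """Constant-motion baseline (assumption, not the learned model):
    the forward motion is taken to be the negated backward motion."""
    return -np.asarray(backward, dtype=float)


def fuse_scene_flow(forward, backward_as_forward, occlusion_weight):
    """Blend two per-pixel estimates of the forward motion.

    forward:             (H, W, C) dual-frame estimate, reference -> next frame
    backward_as_forward: (H, W, C) estimate from reference -> previous frame,
                         already mapped to a forward motion by some motion model
    occlusion_weight:    (H, W) values in [0, 1]; 1 means the forward estimate
                         is treated as occluded and fully replaced
    """
    forward = np.asarray(forward, dtype=float)
    backward_as_forward = np.asarray(backward_as_forward, dtype=float)
    # Add a trailing channel axis so the weight broadcasts over all C channels.
    w = np.clip(np.asarray(occlusion_weight, dtype=float), 0.0, 1.0)[..., None]
    return (1.0 - w) * forward + w * backward_as_forward


# Toy 2x2 example with 3 motion channels per pixel.
forward = np.ones((2, 2, 3))
backward = -2.0 * np.ones((2, 2, 3))
weights = np.array([[0.0, 1.0], [0.5, 0.0]])  # hand-set stand-in for a predicted mask
fused = fuse_scene_flow(forward, constant_motion_inverse(backward), weights)
```

Pixels with weight 0 keep the forward estimate, pixels with weight 1 take the value derived from the backward direction, and the weight map itself plays the role of a soft occlusion mask.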

René Schuster, Christian Unger, Didier Stricker • 2020

Related benchmarks

Task	Dataset	Result
Optical Flow	KITTI 2015 (test)	Fl Error (All): 7.67	122
Disparity Estimation	KITTI 2015 (test)	D1 Error (bg, all): 2.08	77
Scene Flow	KITTI Scene Flow 2015 (test)	--	28
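For readers unfamiliar with the Fl and D1 metrics in the table, the sketch below shows how a KITTI 2015 style outlier percentage can be computed. It follows the benchmark's published criterion (a pixel counts as an outlier when its error exceeds both 3 px and 5% of the ground-truth magnitude); the function name and array layout are assumptions for illustration, and this is not the official evaluation code.

```python
import numpy as np


def kitti_outlier_rate(est, gt, valid, abs_thresh=3.0, rel_thresh=0.05):
    """Percentage of valid pixels whose error exceeds both thresholds.

    est, gt: (H, W) disparity maps or (H, W, 2) optical flow fields
    valid:   (H, W) boolean mask of pixels that have ground truth
    """
    est = np.asarray(est, dtype=float)
    gt = np.asarray(gt, dtype=float)
    valid = np.asarray(valid, dtype=bool)
    if est.ndim == valid.ndim:
        # Disparity: absolute difference per pixel.
        err = np.abs(est - gt)
        mag = np.abs(gt)
    else:
        # Optical flow: end-point error per pixel.
        err = np.linalg.norm(est - gt, axis=-1)
        mag = np.linalg.norm(gt, axis=-1)
    # Comparing against rel_thresh * mag avoids dividing by zero ground truth.
    outlier = (err > abs_thresh) & (err > rel_thresh * mag)
    return 100.0 * outlier[valid].mean()


# Toy flow field: end-point errors of 0, 4, 4, 1 px against magnitudes 10, 100, 10, 0.
gt_flow = np.array([[[10.0, 0.0], [100.0, 0.0], [10.0, 0.0], [0.0, 0.0]]])
est_flow = np.array([[[10.0, 0.0], [104.0, 0.0], [14.0, 0.0], [1.0, 0.0]]])
fl_all = kitti_outlier_rate(est_flow, gt_flow, np.ones((1, 4), dtype=bool))
```

In the toy example only the third pixel is an outlier: the second pixel has a 4 px error, but that is below 5% of its 100 px ground-truth motion, so it passes.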
