A Large Dataset to Train Convolutional Networks for Disparity, Optical Flow, and Scene Flow Estimation

About

Recent work has shown that optical flow estimation can be formulated as a supervised learning task and can be successfully solved with convolutional networks. Training of the so-called FlowNet was enabled by a large synthetically generated dataset. The present paper extends the concept of optical flow estimation via convolutional networks to disparity and scene flow estimation. To this end, we propose three synthetic stereo video datasets with sufficient realism, variation, and size to successfully train large networks. Our datasets are the first large-scale datasets to enable training and evaluating scene flow methods. Besides the datasets, we present a convolutional network for real-time disparity estimation that provides state-of-the-art results. By combining a flow and disparity estimation network and training it jointly, we demonstrate the first scene flow estimation with a convolutional network.

Nikolaus Mayer, Eddy Ilg, Philip Häusser, Philipp Fischer, Daniel Cremers, Alexey Dosovitskiy, Thomas Brox • 2015

Related benchmarks

Task	Dataset	Result
Stereo Matching	KITTI 2015 (test)	D1 Error (Overall) 4.34	245
Stereo Matching	KITTI 2015	D1 Error (All) 4.34	142
Stereo Matching	KITTI 2012 (test)	Outlier Rate (3px, Noc) 4.11	105
Stereo Matching	Scene Flow (test)	EPE 1.68	84
Disparity Estimation	KITTI 2015 (test)	D1 Error (bg, all) 4.32	77
Stereo Matching	KITTI Noc 2015	D1 Error (Background) 4.11	42
Stereo Matching	Scene Flow	EPE (px) 1	40
Stereo Matching	KITTI 2012 (Noc)	Error Rate (>2px) 7.38	26
Stereo Matching	KITTI 2012 (All split)	Error Rate (>2px) 8.11	26
Disparity Estimation	Scene Flow (test)	EPE 1.68	24

Showing 10 of 13 rows
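
The two metrics that dominate the table above, EPE and D1 error, can be illustrated with a minimal sketch. It assumes the usual conventions rather than anything stated on this page: EPE is the mean absolute disparity error in pixels, and a pixel counts as a D1 outlier when its error exceeds both 3 px and 5% of the ground-truth disparity (the KITTI 2015 convention; the KITTI 2012 rows use plain pixel thresholds such as 2 px or 3 px instead). The function names, the flat-list inputs, and the use of 0 to mark missing ground truth are illustrative assumptions.

```python
def end_point_error(pred, gt):
    """Mean absolute disparity error (px) over pixels with valid ground truth."""
    # Assumption: a ground-truth value of 0 marks a pixel without ground truth.
    errors = [abs(p - g) for p, g in zip(pred, gt) if g > 0]
    return sum(errors) / len(errors)


def d1_error(pred, gt, abs_thresh=3.0, rel_thresh=0.05):
    """Percentage of valid pixels whose error exceeds both thresholds."""
    valid = [(p, g) for p, g in zip(pred, gt) if g > 0]
    outliers = sum(
        1
        for p, g in valid
        if abs(p - g) > abs_thresh and abs(p - g) > rel_thresh * g
    )
    return 100.0 * outliers / len(valid)


# Hypothetical disparities for four pixels; the last one has no ground truth.
pred = [10.0, 20.5, 35.0, 50.0]
gt = [10.5, 20.0, 30.0, 0.0]

epe = end_point_error(pred, gt)  # errors 0.5, 0.5, 5.0 over three valid pixels
d1 = d1_error(pred, gt)          # one of three valid pixels is an outlier
print(f"EPE = {epe:.2f} px, D1 = {d1:.2f} %")
```

Real evaluations operate on full disparity maps and, for the "Noc" rows, restrict the valid set to non-occluded pixels; the flat lists here only keep the arithmetic visible.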
