RAFT-Stereo: Multilevel Recurrent Field Transforms for Stereo Matching
About
We introduce RAFT-Stereo, a new deep architecture for rectified stereo based on the optical flow network RAFT. We introduce multi-level convolutional GRUs, which propagate information across the image more efficiently. A modified version of RAFT-Stereo can perform accurate real-time inference. RAFT-Stereo ranks first on the Middlebury leaderboard, outperforming the next best method on 1px error by 29%, and outperforms all published work on the ETH3D two-view stereo benchmark. Code is available at https://github.com/princeton-vl/RAFT-Stereo.
Lahav Lipson, Zachary Teed, Jia Deng • 2021
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Stereo Matching | KITTI 2015 (test) | D1 Error (Overall) | 1.82 | 144 |
| Stereo Matching | KITTI 2015 | D1 Error (All) | 1.82 | 118 |
| Optical Flow Estimation | Sintel Final (test) | -- | -- | 101 |
| Optical Flow | KITTI 2015 (test) | -- | -- | 95 |
| Stereo Matching | KITTI 2012 | Error Rate (3px, Noc) | 1.3 | 81 |
| Stereo Matching | KITTI 2012 (test) | Outlier Rate (3px, Noc) | 1.3 | 76 |
| Stereo Matching | ETH3D | bad 1.0 | 0.033 | 51 |
| Stereo Matching | Middlebury (test) | -- | -- | 47 |
| Stereo Matching | Scene Flow | EPE (px) | 0.72 | 40 |
| Stereo Matching | KITTI 2015 (all pixels) | D1 Error (Background) | 1.58 | 38 |
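The metrics named in the table (EPE, bad 1.0, D1 error) are all computed from per-pixel disparity errors. The following is a minimal sketch of how they are commonly defined, not the evaluation code used by any of these leaderboards. The function name `disparity_metrics` and its arguments are hypothetical, and the D1 rule assumed here (error above 3 px and above 5% of the ground-truth disparity) follows the usual KITTI 2015 convention; exact thresholds and boundary handling may differ per benchmark.

```python
def disparity_metrics(pred, gt, valid=None, tau=1.0):
    """Compute common stereo error metrics over flat sequences of disparities.

    pred, gt : sequences of predicted and ground-truth disparities (pixels)
    valid    : optional sequence of booleans marking pixels with ground truth
    tau      : threshold in pixels for the "bad tau" metric (assumed 1.0)
    """
    if valid is None:
        valid = [True] * len(gt)

    # Keep (absolute error, ground truth) pairs for valid pixels only.
    errors = [(abs(p - g), g) for p, g, v in zip(pred, gt, valid) if v]
    if not errors:
        raise ValueError("no valid pixels to evaluate")
    n = len(errors)

    # EPE: mean absolute disparity error, in pixels.
    epe = sum(e for e, _ in errors) / n

    # bad tau: percentage of pixels whose error exceeds tau pixels.
    bad_tau = 100.0 * sum(e > tau for e, _ in errors) / n

    # D1 (assumed KITTI 2015 convention): percentage of pixels whose error
    # exceeds both 3 px and 5% of the ground-truth disparity.
    d1 = 100.0 * sum(e > 3.0 and e > 0.05 * abs(g) for e, g in errors) / n

    return {"epe": epe, "bad_tau": bad_tau, "d1": d1}


if __name__ == "__main__":
    pred = [10.0, 20.5, 30.0, 104.0]
    gt = [10.0, 20.0, 34.0, 100.0]
    print(disparity_metrics(pred, gt))
```

In this toy example the last pixel has a 4 px error but is not a D1 outlier, because 4 px is below 5% of its ground-truth disparity of 100 px. This relative tolerance is why D1 and fixed-threshold metrics such as bad 1.0 can rank methods differently.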