Rebalancing gradient to improve self-supervised co-training of depth, odometry and optical flow predictions
About
We present CoopNet, an approach that improves the cooperation of co-trained networks by dynamically adapting the apportionment of gradient to ensure equitable learning progress. It is applied to motion-aware self-supervised prediction of depth maps through a new hybrid loss, based on a distribution model of the photometric reconstruction errors made by the paired depth and odometry networks on the one hand and by the optical flow network on the other. This model essentially assumes that the pixels belonging to moving objects (which must be discarded when training depth and odometry) are those where the two reconstructions strongly disagree. We justify this model with theoretical considerations and experimental evidence. A comparative evaluation on the KITTI and Cityscapes datasets shows that CoopNet improves on or is comparable to the state of the art in depth, odometry and optical flow prediction.
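The assumption about moving objects can be illustrated with a small NumPy sketch. This is not the hybrid loss from the paper, which relies on a distribution model of the errors; it is a simplified stand-in that uses a fixed threshold, and the names `photometric_error`, `static_pixel_mask`, `masked_rigid_loss` and the `margin` parameter are all hypothetical. It only shows the idea of discarding pixels where the rigid (depth + odometry) reconstruction is much worse than the optical flow reconstruction.

```python
import numpy as np

def photometric_error(target, reconstruction):
    """Per-pixel mean absolute difference over the color channels."""
    return np.abs(target - reconstruction).mean(axis=-1)

def static_pixel_mask(err_rigid, err_flow, margin=0.1):
    """True where the two reconstructions roughly agree.

    Assumption for this sketch: a pixel is treated as moving when the
    rigid error exceeds the flow error by more than `margin`.
    """
    return (err_rigid - err_flow) <= margin

def masked_rigid_loss(err_rigid, mask):
    """Mean rigid photometric error over the pixels kept by the mask."""
    kept = mask.sum()
    return float((err_rigid * mask).sum() / max(kept, 1))

# Toy example: a uniform gray target image of 4 x 4 pixels.
target = np.full((4, 4, 3), 0.5)
recon_flow = target.copy()    # the flow warp explains every pixel
recon_rigid = target.copy()
recon_rigid[1, 2] = 1.0       # the rigid warp fails on one "moving" pixel

err_rigid = photometric_error(target, recon_rigid)
err_flow = photometric_error(target, recon_flow)
mask = static_pixel_mask(err_rigid, err_flow, margin=0.1)
loss = masked_rigid_loss(err_rigid, mask)
```

In this toy case the single pixel where the rigid reconstruction fails is excluded, so the depth and odometry loss is computed only over the remaining 15 pixels.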
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Depth Estimation | KITTI | RMSE | 4.824 | 156 |
| Optical Flow | KITTI 2015 (test) | -- | -- | 109 |
| Monocular Depth Estimation | Cityscapes | Accuracy (delta < 1.25) | 84.6 | 74 |
| Monocular Depth Estimation | KITTI 2015 (test) | Abs Rel | 0.125 | 22 |
| Odometry estimation | KITTI Odometry Sequence 09 | -- | -- | 14 |
| Visual Odometry | KITTI Odometry Seq. 09 | t_err | 8.42 | 12 |
| Odometry | KITTI Odometry Sequence 10 | Translational Error (%) | 7.29 | 9 |
| Depth Estimation | Cityscapes | Abs Rel | 0.121 | 4 |
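
For readers unfamiliar with the depth metrics in the table, the sketch below computes Abs Rel, RMSE and accuracy (delta < 1.25) using their commonly used definitions. The function name `depth_metrics` is hypothetical, and the exact evaluation protocol behind the numbers above (depth cap, cropping, scale alignment) is not stated here, so this should be read as an illustration of the formulas only.

```python
import numpy as np

def depth_metrics(pred, gt):
    """Common depth metrics over pixels with valid (positive) ground truth."""
    pred = np.asarray(pred, dtype=float)
    gt = np.asarray(gt, dtype=float)
    valid = gt > 0                      # ignore pixels without ground truth
    pred, gt = pred[valid], gt[valid]

    abs_rel = np.mean(np.abs(pred - gt) / gt)        # mean relative error
    rmse = np.sqrt(np.mean((pred - gt) ** 2))        # root mean squared error
    ratio = np.maximum(pred / gt, gt / pred)
    delta_125 = np.mean(ratio < 1.25)                # fraction within 25%
    return {
        "abs_rel": float(abs_rel),
        "rmse": float(rmse),
        "delta_1.25": float(delta_125),
    }

# Toy example: the third pixel has no ground truth and is skipped.
metrics = depth_metrics(pred=[11.0, 20.0, 5.0, 60.0], gt=[10.0, 20.0, 0.0, 40.0])
```

Lower is better for Abs Rel and RMSE, while higher is better for the delta accuracy, which the table reports as a percentage.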