3D Common Corruptions and Data Augmentation
About
We introduce a set of image transformations that can be used as corruptions to evaluate the robustness of models as well as data augmentation mechanisms for training neural networks. The primary distinction of the proposed transformations is that, unlike existing approaches such as Common Corruptions, the geometry of the scene is incorporated in the transformations, thus leading to corruptions that are more likely to occur in the real world. We also introduce a set of semantic corruptions (e.g. natural object occlusions). We show these transformations are "efficient" (can be computed on-the-fly), "extendable" (can be applied on most image datasets), expose vulnerability of existing models, and can effectively make models more robust when employed as "3D data augmentation" mechanisms. The evaluations on several tasks and datasets suggest incorporating 3D information into benchmarking and training opens up a promising direction for robustness research.
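To make the idea of a geometry-aware corruption concrete, the sketch below applies depth-dependent fog to an image using the standard atmospheric scattering model, in which each pixel is blended toward a constant airlight by a transmission factor exp(-beta * depth). This is an illustrative assumption, not the authors' implementation: the function name `depth_fog`, the parameters `beta` and `airlight`, and the toy inputs are all hypothetical.

```python
import numpy as np

def depth_fog(image, depth, beta=0.5, airlight=0.8):
    """Blend an RGB image toward a constant airlight according to per-pixel depth.

    image: float array of shape (H, W, 3) with values in [0, 1]
    depth: float array of shape (H, W), distance from the camera
    beta:  scattering coefficient (hypothetical default); larger means denser fog
    """
    image = np.asarray(image, dtype=np.float64)
    depth = np.asarray(depth, dtype=np.float64)
    # Transmission decays exponentially with distance; add a channel axis to broadcast.
    transmission = np.exp(-beta * depth)[..., None]
    return image * transmission + airlight * (1.0 - transmission)

# Toy inputs (assumed): a random image and a depth map that grows from left to right.
rng = np.random.default_rng(0)
image = rng.random((4, 4, 3))
depth = np.tile(np.linspace(0.0, 10.0, 4), (4, 1))
foggy = depth_fog(image, depth)
```

Pixels at zero depth are left unchanged, while distant pixels converge to the airlight value. A 2D corruption applied uniformly over the image would not reproduce this depth dependence.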
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Surface Normal Prediction | NYU V2 | Mean Error | 17.2 | 100 |
| Surface Normal Estimation | DIODE (test) | L1 Error | 22.5 | 24 |
| Surface Normal Estimation | ScanNet Normal Benchmark (test) | Angle Error Threshold (11.25°) | 60.2 | 18 |
| Transparent object normal estimation | TransNormal Synthetic (test) | Mean Angular Error | 8.2 | 13 |
| Transparent object normal estimation | ClearGrasp Synthetic (test) | Mean Angular Error | 33.8 | 13 |
| Transparent object normal estimation | ClearPose Real-World (test) | Mean Angular Error | 51.7 | 13 |
| Video Surface Normal Estimation | Sintel | Mean Angular Error | 40.5 | 12 |
| Surface Normal Estimation | Taskonomy 2DCC (test) | L1 Error | 5.29 | 7 |
| Surface Normal Estimation | Taskonomy 3DCC (test) | L1 Error | 5.35 | 7 |
| Surface Normal Estimation | Taskonomy AE (test) | L1 Error | 4.94 | 7 |
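
For readers unfamiliar with the angular metrics in the table, the following is a minimal sketch of how a mean angular error and a within-11.25° percentage are commonly computed for surface normals. The exact protocol of each benchmark (masking of invalid pixels, strict versus non-strict thresholds, normalization) is not specified here, so treat the helper `angular_errors_deg` and the toy arrays as assumptions.

```python
import numpy as np

def angular_errors_deg(pred, gt):
    """Angle in degrees between predicted and ground-truth normals of shape (..., 3)."""
    # Normalize both sets of vectors so the dot product equals the cosine of the angle.
    pred = pred / np.linalg.norm(pred, axis=-1, keepdims=True)
    gt = gt / np.linalg.norm(gt, axis=-1, keepdims=True)
    # Clip to guard against rounding slightly outside [-1, 1] before arccos.
    cos = np.clip(np.sum(pred * gt, axis=-1), -1.0, 1.0)
    return np.degrees(np.arccos(cos))

# Toy example (assumed data): three predicted normals against a constant ground truth.
pred = np.array([[0.0, 0.0, 1.0], [1.0, 0.0, 0.0], [0.0, 1.0, 1.0]])
gt = np.array([[0.0, 0.0, 1.0], [0.0, 0.0, 1.0], [0.0, 0.0, 1.0]])

err = angular_errors_deg(pred, gt)          # per-sample errors: 0, 90 and 45 degrees
mean_angular_error = err.mean()             # lower is better
within_11_25 = (err < 11.25).mean() * 100   # percentage of samples under the threshold
```

Here only the first sample falls under the 11.25° threshold, so the threshold metric is one third of the samples, while the mean angular error averages all three angles.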