Bridging Unsupervised and Supervised Depth from Focus via All-in-Focus Supervision
About
Depth estimation is a long-standing and important task in computer vision. Most previous works estimate depth from input images under the assumption that the images are all-in-focus (AiF), an assumption that rarely holds in real-world applications. A few works instead take defocus blur into account and treat it as an additional cue for depth estimation. In this paper, we propose a method that estimates both a depth map and an AiF image from a set of images captured at different focus positions (known as a focal stack). We design a shared architecture to exploit the relationship between depth and AiF estimation. As a result, the proposed method can be trained either in a supervised manner with ground-truth depth, or in an unsupervised manner with AiF images as supervisory signals. We show in various experiments that our method outperforms state-of-the-art methods both quantitatively and qualitatively, and is also more efficient at inference time.
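To make the link between depth and AiF estimation concrete, the sketch below shows one way a single per-pixel weighting over the focal stack could yield both outputs. It is an illustrative assumption, not the architecture described in the paper: the `scores` array stands in for whatever a network would predict, and the names `depth_and_aif`, `scores`, and `focus_positions` are hypothetical.

```python
import numpy as np


def softmax(x, axis=0):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def depth_and_aif(stack, focus_positions, scores):
    """Derive a depth map and an AiF image from one shared weighting.

    stack:           (S, H, W, 3) focal stack of S images
    focus_positions: (S,) focus distance of each slice
    scores:          (S, H, W) per-pixel sharpness scores
                     (hypothetical; assumed to come from a network)
    """
    # Normalize the scores over the stack dimension.
    attention = softmax(scores, axis=0)                      # (S, H, W)
    # Depth: attention-weighted average of the focus positions.
    depth = np.tensordot(focus_positions, attention, axes=(0, 0))  # (H, W)
    # AiF: attention-weighted average of the stack images.
    aif = (attention[..., None] * stack).sum(axis=0)         # (H, W, 3)
    return depth, aif


# Toy focal stack: three constant-intensity slices.
S, H, W = 3, 2, 2
stack = np.stack([np.full((H, W, 3), v) for v in (0.1, 0.5, 0.9)])
focus_positions = np.array([1.0, 2.0, 3.0])

# Pretend the middle slice is sharpest at every pixel.
scores = np.zeros((S, H, W))
scores[1] = 50.0

depth, aif = depth_and_aif(stack, focus_positions, scores)
```

Because both outputs are functions of the same weighting in this sketch, a loss on the AiF image alone would still constrain the depth map, which is the intuition behind using AiF images as a supervisory signal when ground-truth depth is unavailable.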
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Depth Estimation | FOD | MAE | 0.071 | 12 |
| Depth Estimation | FT | MAE | 6.04 | 12 |
| Depth-from-Focus | DDFF (val) | MAE | 0.0028 | 8 |
| Shape-from-focus | FoD (test) | Params (M) | 16.53 | 7 |
| Depth Estimation | FT (test) | MAE | 6.81 | 6 |
| Depth Estimation | FoD (test) | MAE | 0.071 | 5 |
| Depth Estimation | DDFF 12 (test) | MSE | 8.60e-4 | 4 |