Deep Depth from Focus with Differential Focus Volume
About
Depth-from-focus (DFF) is a technique that infers depth from the change in focus of a camera. In this work, we propose a convolutional neural network (CNN) that finds the best-focused pixels in a focal stack and infers depth from the focus estimate. The key innovation of the network is the deep differential focus volume (DFV). By computing the first-order derivative of the stacked features across focal distances, the DFV captures both focus and context information for focus analysis. We also introduce a probability regression mechanism for focus estimation, which handles sparsely sampled focal stacks and provides an uncertainty estimate for the final prediction. Comprehensive experiments demonstrate that the proposed model achieves state-of-the-art performance on multiple datasets, with good generalization and high speed.
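The two ideas in the abstract, a first-order difference along the focal axis and a probability-weighted depth regression, can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The function names, the tensor layouts, the plain finite difference (without learned layers or padding), and the use of the standard deviation as the uncertainty measure are all assumptions made for illustration.

```python
import numpy as np


def differential_focus_volume(features):
    """Finite difference of per-slice features along the focal axis.

    features: array of shape (N, C, H, W), one feature map per focal slice,
              ordered by focal distance (layout is an assumption).
    Returns an array of shape (N - 1, C, H, W).
    """
    return np.diff(features, axis=0)


def regress_depth(focus_scores, focal_distances):
    """Turn per-slice focus scores into a depth map and an uncertainty map.

    focus_scores:    array of shape (N, H, W), higher means sharper.
    focal_distances: array of shape (N,), the focal distance of each slice.
    Returns (depth, std), each of shape (H, W).
    """
    # Numerically stable softmax over the focal axis.
    shifted = focus_scores - focus_scores.max(axis=0, keepdims=True)
    prob = np.exp(shifted)
    prob /= prob.sum(axis=0, keepdims=True)

    # Expected focal distance under the per-pixel focus distribution.
    d = np.asarray(focal_distances, dtype=float).reshape(-1, 1, 1)
    depth = (prob * d).sum(axis=0)

    # Spread of the distribution, used here as a hypothetical uncertainty.
    std = np.sqrt((prob * (d - depth) ** 2).sum(axis=0))
    return depth, std
```

Because the depth is an expectation over focal distances rather than an argmax, it can fall between two sampled focal planes, which is why this kind of regression suits sparsely sampled stacks. A sharply peaked distribution gives a small standard deviation, and a flat one gives a large standard deviation.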
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Depth Estimation | NYU Depth V2 | RMSE | 0.136 | 177 |
| Depth Estimation | NYUv2 1 (test) | RMSE | 0.232 | 19 |
| Depth-from-Defocus | NYUv2 (test) | Delta 1 Threshold | 96.7 | 17 |
| Depth Prediction | Synthetic (test) | Delta 1 Accuracy | 51.8 | 9 |
| Depth Estimation | ARKitScenes (val) | RMSE | 0.43 | 7 |
| Shape-from-focus | FoD (test) | Params (M) | 19.5 | 7 |
| Depth Estimation | DDFF-12 (val) | MSE | 5.70e-4 | 6 |
| Depth Estimation | FoD500 (test) | MSE | 0.0188 | 6 |
| Depth Estimation | FT (test) | MAE | 5.51 | 6 |
| Depth Estimation | DDFF12 | MSE | 5.70e-4 | 6 |