Deep Depth from Focus with Differential Focus Volume
About
Depth-from-focus (DFF) is a technique that infers depth from the focus changes of a camera. In this work, we propose a convolutional neural network (CNN) that finds the best-focused pixels in a focal stack and infers depth from the focus estimate. The key innovation of the network is the deep differential focus volume (DFV). By computing the first-order derivative of the stacked features across focal distances, the DFV captures both focus and context information for focus analysis. We also introduce a probability regression mechanism for focus estimation, which handles sparsely sampled focal stacks and provides an uncertainty estimate for the final prediction. Comprehensive experiments demonstrate that the proposed model achieves state-of-the-art performance on multiple datasets, with good generalizability and fast inference.
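The two ideas in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: it assumes the first-order derivative is approximated by a finite difference between adjacent focal slices, that probability regression is a softmax over per-slice focus scores followed by an expectation over focal distances, and that uncertainty is reported as the standard deviation of that distribution. The function names, tensor shapes, and the use of raw scores in place of network outputs are all hypothetical.

```python
import numpy as np


def differential_focus_volume(features):
    """Finite-difference approximation of the derivative along the focal axis.

    features: array of shape (N, C, H, W), one feature map per focal slice,
              ordered by focal distance (assumed layout).
    returns:  array of shape (N - 1, C, H, W).
    """
    return features[1:] - features[:-1]


def softmax(x, axis=0):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def regress_depth(scores, focal_distances):
    """Turn per-slice focus scores into a depth map and an uncertainty map.

    scores:          array of shape (N, H, W), higher means better focused
                     (stand-in for a network's output; an assumption).
    focal_distances: array of shape (N,), focal distance of each slice.
    returns:         (depth, std), each of shape (H, W).
    """
    prob = softmax(scores, axis=0)               # focus probability per slice
    dist = focal_distances.reshape(-1, 1, 1)     # broadcast over H and W
    depth = (prob * dist).sum(axis=0)            # expected focal distance
    var = (prob * (dist - depth) ** 2).sum(axis=0)
    return depth, np.sqrt(var)                   # std as uncertainty


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    stack_features = rng.normal(size=(5, 8, 16, 16))   # 5 focal slices
    dfv = differential_focus_volume(stack_features)
    print(dfv.shape)

    scores = rng.normal(size=(5, 16, 16))
    depth, std = regress_depth(scores, np.linspace(0.1, 1.0, 5))
    print(depth.shape, std.shape)
```

Because the depth is an expectation over focal distances rather than an argmax, the estimate can fall between sampled slices, which is one plausible reading of how probability regression helps with sparsely sampled focal stacks.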
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Depth Estimation | NYU Depth V2 | RMSE | 0.136 | 209 |
| Depth Estimation | iBims | Abs Rel Error | 9.6 | 21 |
| Depth Estimation | NYUv2 1 (test) | RMSE | 0.232 | 19 |
| Depth-from-Defocus | NYUv2 (test) | Delta 1 Threshold | 96.7 | 17 |
| Depth Estimation | FT | MAE | 5.509 | 12 |
| Depth Estimation | FOD | MAE | 0.077 | 12 |
| Depth Estimation | ZEDD (test) | Delta Accuracy (Thresh=1.05) | 15.3 | 10 |
| Depth Estimation | Infinigen Defocus | Accuracy (delta 1.05) | 5.3 | 10 |
| Depth-from-Defocus | DDFF | MSE | 5.70e-4 | 9 |
| Depth Prediction | Synthetic (test) | Delta 1 Accuracy | 51.8 | 9 |