SIR-DIFF: Sparse Image Sets Restoration with Multi-View Diffusion Model
About
The computer vision community has developed numerous techniques for digitally restoring true scene information from single-view degraded photographs, an important yet extremely ill-posed task. In this work, we tackle image restoration from a different perspective by jointly denoising multiple photographs of the same scene. Our core hypothesis is that degraded images capturing a shared scene contain complementary information that, when combined, better constrains the restoration problem. To this end, we implement a powerful multi-view diffusion model that jointly generates uncorrupted views by extracting rich information from multi-view relationships. Our experiments show that our multi-view approach outperforms existing single-view image restoration methods, and even video-based methods, on image deblurring and super-resolution tasks. Critically, our model is trained to output 3D-consistent images, making it a promising tool for applications requiring robust multi-view integration, such as 3D reconstruction or pose estimation.
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Pose Estimation | ETH3D DA3 (test) | AUC@30 | 20.94 | 12 |
| 3D Reconstruction | HiRoom DA3 | F-score | 5.12 | 2 |
| 3D Reconstruction | ETH3D DA3 | F-score | 11.72 | 2 |
| 3D Reconstruction | DTU DA3 | Overall Metric Value | 8.101 | 2 |
| 3D Reconstruction | 7Scenes DA3 | F-score | 0.00e+0 | 2 |
| 3D Reconstruction | ScanNet++ DA3 | F-score | 12.74 | 2 |
| Pose Estimation | HiRoom DA3 (test) | AUC@30 | 9.73 | 2 |
| Pose Estimation | DTU DA3 (test) | AUC@30 | 16.59 | 2 |
| Pose Estimation | 7Scenes DA3 (test) | AUC@30 | 2.79 | 2 |
| Pose Estimation | ScanNet++ DA3 (test) | AUC@30 | 30.02 | 2 |