Non-Local Video Denoising by CNN
About
Non-local patch-based methods were until recently state-of-the-art for image denoising but are now outperformed by CNNs. Yet they are still the state-of-the-art for video denoising, as video redundancy is a key factor to attain high denoising performance. The problem is that CNN architectures are hardly compatible with the search for self-similarities. In this work we propose a new and efficient way to feed video self-similarities to a CNN. The non-locality is incorporated into the network via a first non-trainable layer which finds, for each patch in the input image, its most similar patches in a search region. The central values of these patches are then gathered in a feature vector which is assigned to each image pixel. This information is presented to a CNN which is trained to predict the clean image. We apply the proposed architecture to image and video denoising. For the latter, patches are searched for in a 3D spatio-temporal volume. The proposed architecture achieves state-of-the-art results. To the best of our knowledge, this is the first successful application of a CNN to video denoising.
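The non-trainable first layer described in the abstract can be illustrated with a minimal brute-force sketch in NumPy. It is not the authors' implementation: the function name `nonlocal_features`, the parameters `patch`, `search`, `search_t` and `k`, their default values, the squared L2 patch distance, and the reflect padding are all assumptions chosen for illustration. For each pixel it compares the surrounding patch with every patch in a spatio-temporal search region, keeps the `k` closest ones, and stores their central pixel values as that pixel's feature vector.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def nonlocal_features(video, patch=3, search=5, search_t=3, k=4):
    """Hypothetical sketch of a non-local feature layer.

    video: float array of shape (T, H, W), one grayscale frame per index.
    Returns an array of shape (T, H, W, k) holding, for each pixel, the
    central values of its k most similar patches, nearest first.
    All parameter names and defaults are illustrative assumptions.
    """
    T, H, W = video.shape
    r = patch // 2        # patch radius (patch size assumed odd)
    s = search // 2       # spatial search radius
    st = search_t // 2    # temporal search radius

    # Pad spatially so every pixel has a full patch around it.
    padded = np.pad(video, ((0, 0), (r, r), (r, r)), mode="reflect")
    # patches[t, y, x] is the flattened patch centred on pixel (t, y, x).
    windows = sliding_window_view(padded, (patch, patch), axis=(1, 2))
    patches = windows.reshape(T, H, W, patch * patch)

    feats = np.empty((T, H, W, k), dtype=video.dtype)
    for t in range(T):
        t0, t1 = max(0, t - st), min(T, t + st + 1)
        for y in range(H):
            y0, y1 = max(0, y - s), min(H, y + s + 1)
            for x in range(W):
                x0, x1 = max(0, x - s), min(W, x + s + 1)
                # Candidate patches and their centre pixels in the
                # search region, clipped at the video borders.
                cand = patches[t0:t1, y0:y1, x0:x1].reshape(-1, patch * patch)
                centres = video[t0:t1, y0:y1, x0:x1].reshape(-1)
                # Squared L2 distance to the reference patch.
                dist = np.sum((cand - patches[t, y, x]) ** 2, axis=1)
                nearest = np.argsort(dist, kind="stable")[:k]
                feats[t, y, x] = centres[nearest]
    return feats
```

Because the reference patch is always among the candidates at distance zero, the first feature channel reproduces the noisy input itself, and the remaining channels carry the values of its neighbours in patch space. The resulting `(T, H, W, k)` array is what a denoising CNN would receive as input in this sketch. The triple loop is written for readability only and would be far too slow for real videos.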
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| Video Denoising | Set8 | PSNR 37.28 | 136 |
| Video Denoising | Set8 (test) | PSNR 37.1 | 127 |
| Video Denoising | DAVIS 2017 | PSNR 39.56 | 51 |
| Video Denoising | DAVIS (test) | -- | 37 |