Deep Bilateral Learning for Real-Time Image Enhancement
About
Performance is a critical challenge in mobile image processing. Given a reference imaging pipeline, or even human-adjusted pairs of images, we seek to reproduce the enhancements and enable real-time evaluation. For this, we introduce a new neural network architecture inspired by bilateral grid processing and local affine color transforms. Using pairs of input/output images, we train a convolutional neural network to predict the coefficients of a locally-affine model in bilateral space. Our architecture learns to make local, global, and content-dependent decisions to approximate the desired image transformation. At runtime, the neural network consumes a low-resolution version of the input image, produces a set of affine transformations in bilateral space, upsamples those transformations in an edge-preserving fashion using a new slicing node, and then applies those upsampled transformations to the full-resolution image. Our algorithm processes high-resolution images on a smartphone in milliseconds, provides a real-time viewfinder at 1080p resolution, and matches the quality of state-of-the-art approximation techniques on a large class of image operators. Unlike previous work, our model is trained off-line from data and therefore does not require access to the original operator at runtime. This allows our model to learn complex, scene-dependent transformations for which no reference implementation is available, such as the photographic edits of a human retoucher.
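The runtime steps described above (predict affine coefficients in a low-resolution bilateral grid, slice the grid with a guidance map, apply the per-pixel transform at full resolution) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The function name `slice_and_apply`, the grid layout `(gh, gw, gd, 3, 4)`, and the use of plain trilinear interpolation are assumptions made for illustration. The network that predicts the grid is omitted, and the guidance map is taken as a given input scaled to [0, 1].

```python
import numpy as np

def slice_and_apply(grid, guide, image):
    """Slice a bilateral grid of affine coefficients and apply them.

    grid:  (gh, gw, gd, 3, 4) array, one 3x4 affine matrix per grid cell
           (hypothetical layout chosen for this sketch).
    guide: (H, W) guidance map with values in [0, 1].
    image: (H, W, 3) full-resolution RGB image.
    """
    gh, gw, gd = grid.shape[:3]
    H, W = guide.shape

    # Continuous grid coordinates of every full-resolution pixel.
    ys = (np.arange(H) + 0.5) / H * gh - 0.5
    xs = (np.arange(W) + 0.5) / W * gw - 0.5
    gy = np.broadcast_to(ys[:, None], (H, W))
    gx = np.broadcast_to(xs[None, :], (H, W))
    gz = guide * gd - 0.5  # the guide selects the depth (intensity) bin

    y0 = np.floor(gy).astype(int)
    x0 = np.floor(gx).astype(int)
    z0 = np.floor(gz).astype(int)

    # Trilinear interpolation over the 8 neighbouring grid cells.
    coeffs = np.zeros((H, W, 3, 4))
    for dy in (0, 1):
        for dx in (0, 1):
            for dz in (0, 1):
                yi, xi, zi = y0 + dy, x0 + dx, z0 + dz
                # Linear (tent) weight along each axis.
                w = (np.maximum(1 - np.abs(gy - yi), 0)
                     * np.maximum(1 - np.abs(gx - xi), 0)
                     * np.maximum(1 - np.abs(gz - zi), 0))
                # Clamp indices so border pixels reuse the edge cells.
                yc = np.clip(yi, 0, gh - 1)
                xc = np.clip(xi, 0, gw - 1)
                zc = np.clip(zi, 0, gd - 1)
                coeffs += w[..., None, None] * grid[yc, xc, zc]

    # Per-pixel affine transform: out = A[:, :3] @ rgb + A[:, 3].
    return np.einsum('hwij,hwj->hwi', coeffs[..., :3], image) + coeffs[..., 3]

# Usage: a grid filled with identity transforms leaves the image unchanged.
rng = np.random.default_rng(0)
image = rng.random((16, 20, 3))
guide = image.mean(axis=-1)  # assumption: mean intensity as the guide
grid = np.zeros((4, 4, 8, 3, 4))
grid[..., :3, :3] = np.eye(3)
output = slice_and_apply(grid, guide, image)
```

Because the guide indexes the grid's depth axis, neighbouring pixels with very different guide values read coefficients from different depth bins, which is what makes the upsampling edge-preserving. The mean-intensity guide used here is a simplification for the sketch, not necessarily the guide used by the model.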
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Image Enhancement | Image Enhancement Speed (test) | Running Time (ms) | 3.49 | 56 |
| Image Enhancement | MIT-Adobe FiveK (test) | PSNR | 22.31 | 34 |
| Photo Retouching | FiveK 480p resolution (test) | PSNR | 24.66 | 27 |
| Image Enhancement | Adobe Five-K | PSNR | 24.66 | 22 |
| Imaging pipeline enhancement | FiveK 480p | PSNR | 24.52 | 17 |
| Tone Mapping | FiveK | PSNR | 24.52 | 15 |
| Low-light Image Enhancement | LoL dataset | PSNR | 20.14 | 14 |
| SDRTV-to-HDRTV conversion | HDRTV1K 1.0 (test) | PSNR | 35.73 | 14 |
| Image Enhancement | MIT-Adobe-5K-UPE Expert C ground truth (test) | PSNR | 21.96 | 12 |
| Image Enhancement | Adobe Five-K RAW format (test) | LPIPS | 0.08 | 11 |