Ghost-free High Dynamic Range Imaging with Context-aware Transformer
About
High dynamic range (HDR) deghosting algorithms aim to generate ghost-free HDR images with realistic details. Restricted by the locality of the receptive field, existing CNN-based methods are prone to producing ghosting artifacts and intensity distortions in the presence of large motion and severe saturation. In this paper, we propose a novel Context-Aware Vision Transformer (CA-ViT) for ghost-free high dynamic range imaging. The CA-ViT is designed as a dual-branch architecture that jointly captures global and local dependencies. Specifically, the global branch employs a window-based Transformer encoder to model long-range object movements and intensity variations, which resolves ghosting. For the local branch, we design a local context extractor (LCE) that captures short-range image features and uses a channel attention mechanism to select informative local details across the extracted features, complementing the global branch. By incorporating the CA-ViT as the basic component, we further build the HDR-Transformer, a hierarchical network that reconstructs high-quality ghost-free HDR images. Extensive experiments on three benchmark datasets show that our approach outperforms state-of-the-art methods qualitatively and quantitatively with a considerably reduced computational budget. Code is available at https://github.com/megvii-research/HDR-Transformer
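The channel attention step of the local branch can be illustrated with a small, dependency-free sketch. It is not the authors' implementation: it assumes a squeeze-and-excitation style gate (global average pooling, a two-layer perceptron, a sigmoid), and the function name `channel_attention` and the weight matrices `w1` and `w2` are hypothetical names introduced only for this example.

```python
import math


def channel_attention(features, w1, w2):
    """Reweight the channels of a feature map (illustrative sketch only).

    features: nested lists shaped C x H x W (local features, e.g. from an LCE).
    w1: hidden x C weight matrix (hypothetical, would be learned in practice).
    w2: C x hidden weight matrix (hypothetical, would be learned in practice).
    Returns (scaled_features, gates), with one gate in (0, 1) per channel.
    """
    # Squeeze: global average pooling yields one scalar per channel.
    pooled = [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in features
    ]
    # Excite: linear layer + ReLU, then linear layer + sigmoid.
    hidden = [max(0.0, sum(w * p for w, p in zip(row, pooled))) for row in w1]
    gates = [
        1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
        for row in w2
    ]
    # Select: scale every value in a channel by that channel's gate.
    scaled = [
        [[gate * value for value in row] for row in channel]
        for gate, channel in zip(gates, features)
    ]
    return scaled, gates
```

In a dual-branch block of the kind described above, the reweighted local features would then be combined with the output of the global window-based attention branch; how the two are fused is not specified in the abstract, so it is left out of this sketch.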
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| HDR Imaging | Challenge123 (test) | PSNR-μ | 40.7 | 17 |
| HDR deghosting | Kalantari dataset 6 (test) | PSNR-μ | 44.32 | 11 |
| HDR deghosting | Hu dataset 4 (test) | PSNR-μ | 48.1 | 11 |
| Multi-exposure Fusion | MEFB static 69 | MUSIQ | 63.1 | 11 |
| Multi-exposure HDR image reconstruction | Challenge123 (test) | PSNR (PU21) | 41.9 | 10 |
| HDR deghosting | 1000 x 1500 (test) | Latency (s) | 5.34 | 10 |
| High Dynamic Range Imaging | SCT 1.0 (test) | PSNR (μ) | 42.39 | 9 |
| HDR Imaging | SCT (test) | PSNR (μ) | 31.94 | 8 |
| Multi-exposure Fusion | UltraFusion Benchmark v1 (test) | MUSIQ | 63.66 | 7 |
| Multi-exposure Fusion | RealHDRV dynamic v1 | TMQI | 0.868 | 7 |
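
Several rows above report PSNR-μ, which is conventionally the PSNR computed after μ-law tonemapping of the HDR values. The sketch below shows that convention under stated assumptions: inputs are linear HDR values normalized to [0, 1], μ = 5000 (the value commonly used with the Kalantari dataset), and a peak value of 1. The page does not say which settings each benchmark uses, so the names `mu_law` and `psnr_mu` and the defaults are illustrative only.

```python
import math

MU = 5000.0  # assumed compression parameter; individual benchmarks may differ


def mu_law(h, mu=MU):
    """Map a linear HDR value h in [0, 1] to the mu-law tonemapped domain."""
    return math.log(1.0 + mu * h) / math.log(1.0 + mu)


def psnr_mu(pred, target, mu=MU):
    """PSNR (dB) between two flat sequences of HDR values after mu-law mapping."""
    if len(pred) != len(target) or len(pred) == 0:
        raise ValueError("pred and target must be non-empty and of equal length")
    # Mean squared error in the tonemapped domain.
    mse = sum(
        (mu_law(p, mu) - mu_law(t, mu)) ** 2 for p, t in zip(pred, target)
    ) / len(pred)
    if mse == 0.0:
        return math.inf  # identical images
    # The peak tonemapped value is 1, so PSNR = 10 * log10(1 / MSE).
    return 10.0 * math.log10(1.0 / mse)
```

Because the logarithmic mapping expands dark values and compresses bright ones, errors in dark regions weigh more heavily in PSNR-μ than they would in a PSNR computed on linear values.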