Low-Light Video Enhancement with An Effective Spatial-Temporal Decomposition Paradigm
About
Low-Light Video Enhancement (LLVE) seeks to restore dynamic or static scenes degraded by severely poor visibility and noise. In this paper, we present a video decomposition strategy that separates each frame into view-independent and view-dependent components to improve LLVE performance. We call the framework View-aware Low-light Video Enhancement (VLLVE). We leverage dynamic cross-frame correspondences for the view-independent term (which primarily captures intrinsic appearance) and impose a scene-level continuity constraint on the view-dependent term (which mainly describes the shading condition) to obtain consistent decomposition results. To further enforce this consistency, we introduce a dual-structure enhancement network with a cross-frame interaction mechanism. By supervising different frames simultaneously, the network encourages them to exhibit matching decomposition features. The mechanism integrates with encoder-decoder single-frame networks at minimal additional parameter cost. Building on VLLVE, we propose a more comprehensive decomposition strategy, VLLVE++, which adds an additive residual term. This residual term models scene-adaptive degradations that are difficult to capture with a decomposition formulation designed for common scenes, and so improves the ability to represent the overall content of videos. In addition, VLLVE++ enables bidirectional, end-to-end learning of both enhancement and degradation-aware correspondence refinement, which increases the number of reliable correspondences while filtering out incorrect ones. Notably, VLLVE++ handles challenging cases well, such as real-world scenes and videos with high dynamics. Extensive experiments are conducted on widely recognized LLVE benchmarks.
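To make the decomposition idea concrete, the following is a minimal sketch and not the authors' implementation. It assumes a Retinex-style composition in which the view-independent term multiplies the view-dependent term and the VLLVE++ residual is added afterwards; the abstract does not state the exact formula. The function names `compose_frame` and `temporal_continuity_penalty` are hypothetical, and the penalty is only a simple stand-in for the scene-level continuity constraint described above.

```python
import numpy as np


def compose_frame(view_independent, view_dependent, residual=None):
    """Recombine decomposed terms into a frame with values in [0, 1].

    Assumption (not confirmed by the abstract): the two terms combine
    multiplicatively, and the optional residual is added on top.
    """
    frame = view_independent * view_dependent
    if residual is not None:
        frame = frame + residual
    return np.clip(frame, 0.0, 1.0)


def temporal_continuity_penalty(view_dependent_frames):
    """Mean absolute change of the view-dependent term between consecutive frames.

    Hypothetical stand-in for a continuity constraint: a small value means
    the shading-like term varies smoothly over time.
    Expects an array shaped (num_frames, height, width, channels).
    """
    diffs = np.diff(view_dependent_frames, axis=0)
    return float(np.mean(np.abs(diffs)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    appearance = rng.uniform(0.2, 0.9, size=(4, 8, 8, 3))   # view-independent term
    shading = np.full((4, 8, 8, 1), 0.3)                    # view-dependent term
    residual = rng.normal(0.0, 0.01, size=(4, 8, 8, 3))     # additive residual
    video = compose_frame(appearance, shading, residual)
    print(video.shape, temporal_continuity_penalty(shading))
```

The single-channel view-dependent term broadcasts over the three color channels, which keeps the sketch short; a learned model could just as well predict a per-channel term.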
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Low-light Video Enhancement | SDSD indoor | PSNR | 29.78 | 18 |
| Low-light Video Enhancement | SDSD outdoor | PSNR | 27.47 | 18 |
| Low-light Video Enhancement | SMID | PSNR | 30.71 | 18 |
| Low-light Video Enhancement | DID | PSNR | 31.06 | 18 |
| Low-light Video Enhancement | 3D low-light dataset (test) | PSNR | 23.51 | 12 |
| Low-light Video Enhancement | DAVIS | PSNR | 24.09 | 12 |
| Low-light Video Enhancement | YouTube-VOS (test) | PSNR | 25.75 | 12 |
| Low-light Video Enhancement | DAVIS | Metric Short Seq | 0.014 | 8 |
| Low-light Video Enhancement | SDSD indoor | Short-Term Metric | 0.008 | 8 |
| Low-light Video Enhancement | SDSD outdoor | Short Sequence Error | 0.006 | 8 |
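
Most results above are reported as PSNR (peak signal-to-noise ratio, in dB, higher is better). As a reference point, here is a minimal sketch of the standard PSNR definition, assuming frames stored as floating-point arrays in [0, 1]. The helper names `psnr` and `video_psnr` are illustrative, and averaging per-frame PSNR over a clip is an assumption; the exact evaluation protocol of each benchmark may differ.

```python
import numpy as np


def psnr(reference, estimate, max_value=1.0):
    """Peak signal-to-noise ratio in dB between two images of equal shape."""
    reference = np.asarray(reference, dtype=np.float64)
    estimate = np.asarray(estimate, dtype=np.float64)
    mse = np.mean((reference - estimate) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return float(10.0 * np.log10(max_value ** 2 / mse))


def video_psnr(reference_frames, estimated_frames, max_value=1.0):
    """Average of per-frame PSNR over a clip (one common convention)."""
    scores = [psnr(r, e, max_value)
              for r, e in zip(reference_frames, estimated_frames)]
    return float(np.mean(scores))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    clean = rng.uniform(0.0, 1.0, size=(3, 16, 16, 3))
    noisy = np.clip(clean + rng.normal(0.0, 0.05, size=clean.shape), 0.0, 1.0)
    print(f"{video_psnr(clean, noisy):.2f} dB")
```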