M2Retinexformer: Multi-Modal Retinexformer for Low-Light Image Enhancement

About

Low-light image enhancement is challenging due to complex degradations, including amplified noise, artifacts, and color distortion. While Retinex-based deep learning methods have achieved promising results, they primarily rely on single-modality RGB information. We propose M2Retinexformer (Multi-Modal Retinexformer), a novel framework that extends Retinexformer by incorporating depth cues, luminance priors, and semantic features within a progressive refinement pipeline. Depth provides geometric context that is invariant to lighting variations, while luminance and semantic features offer explicit guidance on brightness distribution and scene understanding. Modalities are extracted at multiple scales and fused through cross-attention, with adaptive gating dynamically balancing illumination-guided self-attention and cross-attention based on the reliability of auxiliary cues. Evaluations on the LOL, SID, SMID, and SDSD benchmarks demonstrate overall improvements over Retinexformer and recent state-of-the-art methods. Code and pretrained weights are available at https://github.com/YoussefAboelwafa/M2Retinexformer

Youssef Aboelwafa, Hicham G. Elmongui, Marwan Torki • 2026
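
The adaptive gating described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function names (`attention`, `gated_fusion`), the single-head attention without learned projections, the scalar per-token sigmoid gate, and the way the illumination map rescales the values are all assumptions made for illustration. It only shows the general idea of blending an illumination-guided self-attention branch with a cross-attention branch over auxiliary (depth, luminance, or semantic) tokens.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def attention(q, k, v):
    # Single-head scaled dot-product attention (learned projections omitted).
    scores = q @ k.T / np.sqrt(q.shape[-1])
    return softmax(scores) @ v

def gated_fusion(rgb, aux, illum, w_gate, b_gate):
    """Blend self-attention and cross-attention with a per-token gate.

    rgb:    (N, d) image tokens
    aux:    (M, d) auxiliary-modality tokens (hypothetical: depth, luminance, ...)
    illum:  (N, 1) illumination weights used to rescale the values (assumption)
    w_gate: (2d, 1) gate weights, b_gate: scalar gate bias (both hypothetical)
    """
    self_out = attention(rgb, rgb, illum * rgb)   # illumination-guided branch
    cross_out = attention(rgb, aux, aux)          # auxiliary-cue branch
    gate_in = np.concatenate([self_out, cross_out], axis=-1)
    gate = sigmoid(gate_in @ w_gate + b_gate)     # (N, 1), values in (0, 1)
    fused = gate * self_out + (1.0 - gate) * cross_out
    return fused, gate

rng = np.random.default_rng(0)
rgb = rng.normal(size=(6, 4))
aux = rng.normal(size=(5, 4))
illum = rng.uniform(0.1, 1.0, size=(6, 1))
w_gate = rng.normal(size=(8, 1))
fused, gate = gated_fusion(rgb, aux, illum, w_gate, 0.0)
print(fused.shape, gate.shape)
```

A gate near 1 keeps the self-attention output and a gate near 0 defers to the auxiliary cues, which is one plausible reading of "balancing based on the reliability of auxiliary cues".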

Related benchmarks

Task	Dataset	Result
Low-light Image Enhancement	LOL v1	PSNR 24.89	195
Low-light Image Enhancement	LOL real v2	PSNR 23.85	164
Low-light Image Enhancement	LOL syn v2	PSNR 27.12	161
Low-light Image Enhancement	SID	PSNR 24.84	70
Low-light Image Enhancement	SMID	PSNR 28.76	55
Low-light Image Enhancement	SDSD outdoor	PSNR 27.39	49
Low-light Image Enhancement	SDSD indoor	PSNR 30.48	37
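
For readers unfamiliar with the metric, PSNR (peak signal-to-noise ratio, conventionally reported in dB) is defined as 10 · log10(MAX² / MSE) between a reference image and the enhanced output. The sketch below is a minimal pure-Python version on flattened pixel lists. The evaluation protocol behind the table (value range, color space, per-image averaging) is not stated here, so `max_value=1.0` and the function name `psnr` are assumptions.

```python
import math

def psnr(reference, enhanced, max_value=1.0):
    """PSNR in dB between two equally sized, flattened pixel sequences."""
    if not reference or len(reference) != len(enhanced):
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all pixels.
    mse = sum((r - e) ** 2 for r, e in zip(reference, enhanced)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)

# A uniform error of 0.1 on a [0, 1] scale gives an MSE of about 0.01.
print(round(psnr([0.0] * 4, [0.1] * 4), 2))
```

Because the scale is logarithmic, each additional 10 dB corresponds to a tenfold reduction in mean squared error.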

