SIRR-LMM: Single-image Reflection Removal via Large Multimodal Model

About

Glass surfaces create complex interactions of reflected and transmitted light, making single-image reflection removal (SIRR) challenging. Existing datasets suffer from limited physical realism in synthetic data or insufficient scale in real captures. We introduce a synthetic dataset generation framework that path-traces 3D glass models over real background imagery to create physically accurate reflection scenarios with varied glass properties, camera settings, and post-processing effects. To leverage the capabilities of a Large Multimodal Model (LMM), we concatenate the image layers into a single composite input, apply joint captioning, and fine-tune the model using task-specific LoRA rather than full-parameter training. This enables our approach to achieve improved reflection removal and separation performance compared to state-of-the-art methods.
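As a rough illustration of the "single composite input" idea, the sketch below places several equally sized image layers side by side in one image. The abstract does not specify which layers are used, their order, or the concatenation axis, so the choice of layers (mixed input, transmission, reflection), the horizontal layout, and the name `concat_layers` are all assumptions made for this example.

```python
def concat_layers(layers):
    """Concatenate equally sized images side by side (horizontally).

    Each image is a list of rows; each row is a list of pixel values.
    The horizontal layout is an assumption, not taken from the paper.
    """
    if not layers:
        raise ValueError("at least one layer is required")
    height = len(layers[0])
    if any(len(img) != height for img in layers):
        raise ValueError("all layers must have the same height")

    composite = []
    for y in range(height):
        row = []
        for img in layers:
            row.extend(img[y])  # append this layer's row to the wide row
        composite.append(row)
    return composite


# Toy 2x2 single-channel "images" standing in for the real layers.
mixed = [[1, 2], [3, 4]]
transmission = [[5, 6], [7, 8]]
reflection = [[9, 10], [11, 12]]

composite = concat_layers([mixed, transmission, reflection])
print(composite)
```

With the toy inputs above, each composite row is three times as wide as a single layer's row, while the height is unchanged.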

Yu Guo, Zhiqiang Lao, Xiyun Song, Yubin Zhou, Heather Yu • 2026

Related benchmarks

Task	Dataset	Result
Single Image Reflection Removal	Real 20 55 (test)	PSNR 24.89	7
Single Image Reflection Removal	SIR^2 141 36 (test)	PSNR (Postcard) 26.2	7
Single Image Reflection Removal	Nature 27 (test)	PSNR 25.14	7
Single Image Reflection Removal	ReaL	Win Rate 57.32	4
Single Image Reflection Removal	Nature	Win Rate 35	4
Single Image Reflection Removal	Postcard	Win Rate 67.03	4
Single Image Reflection Removal	SolidObject	Win Rate 41.43	4
Single Image Reflection Removal	Wildscene	Win Rate 48.57	4
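For readers unfamiliar with the PSNR figures above, the following is a minimal sketch of the standard peak signal-to-noise ratio formula, PSNR = 10 * log10(MAX^2 / MSE). It is not the evaluation code behind these benchmarks: the function name `psnr`, the flat-list pixel representation, and the 8-bit peak value of 255 are assumptions for illustration, and real evaluations may differ in color space, cropping, or averaging.

```python
import math


def psnr(reference, estimate, max_value=255.0):
    """PSNR in dB between two equally sized flat sequences of pixel values."""
    if len(reference) != len(estimate) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and the same length")
    # Mean squared error over all pixels.
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


# Toy example: every pixel is off by 5 grey levels, so MSE = 25.
ground_truth = [255, 255, 255, 255]
prediction = [250, 250, 250, 250]
score = psnr(ground_truth, prediction)
print(f"{score:.2f} dB")
```

Higher values indicate a prediction closer to the ground-truth transmission layer.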
