
LRRNet: A Novel Representation Learning Guided Fusion Network for Infrared and Visible Images

About

Deep learning based fusion methods have achieved promising performance in image fusion tasks, a success largely attributable to the network architecture, which plays a very important role in the fusion process. However, in general, it is hard to specify a good fusion architecture, and consequently the design of fusion networks is still a black art rather than a science. To address this problem, we formulate the fusion task mathematically and establish a connection between its optimal solution and the network architecture that can implement it. This approach leads to the novel method proposed in this paper for constructing a lightweight fusion network, which avoids the time-consuming empirical network design of a trial-and-test strategy. In particular, we adopt a learnable representation approach to the fusion task, in which the construction of the fusion network architecture is guided by the optimisation algorithm that produces the learnable model. The low-rank representation (LRR) objective is the foundation of our learnable model. The matrix multiplications, which are at the heart of the solution, are transformed into convolutional operations, and the iterative process of optimisation is replaced by a special feed-forward network. Based on this novel network architecture, an end-to-end lightweight fusion network is constructed to fuse infrared and visible light images. Its successful training is facilitated by a proposed detail-to-semantic information loss function, designed to preserve the image details and to enhance the salient features of the source images. Our experiments show that the proposed fusion network exhibits better fusion performance than the state-of-the-art fusion methods on public datasets. Interestingly, our network requires fewer training parameters than other existing methods. The code is available at https://github.com/hli1221/imagefusion-LRRNet
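As a rough illustration of the unrolling idea described in the abstract, the sketch below replaces the matrix multiplications of an ISTA-style sparse/low-rank update with learned convolutions, and the optimisation loop with a fixed stack of feed-forward blocks. This is not the authors' implementation: the layer shapes, the soft-thresholding update, and the `UnrolledLRRBlock`/`UnrolledLRRNet` names are assumptions made for this sketch.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def soft_threshold(x, theta):
    # Proximal operator of the L1 norm, used by ISTA-style sparse updates.
    return torch.sign(x) * F.relu(torch.abs(x) - theta)

class UnrolledLRRBlock(nn.Module):
    """One unrolled iteration: the matrix multiplications of the
    optimisation update become 3x3 convolutions, and the shrinkage
    threshold becomes a learnable parameter."""
    def __init__(self, channels):
        super().__init__()
        self.conv_l = nn.Conv2d(channels, channels, 3, padding=1)  # low-rank/base branch
        self.conv_s = nn.Conv2d(channels, channels, 3, padding=1)  # sparse/detail branch
        self.theta = nn.Parameter(torch.tensor(0.1))               # learnable threshold

    def forward(self, x, l, s):
        # Encourage x ~ l + s by updating each component from the residual.
        r = x - l - s
        l = l + self.conv_l(r)
        s = soft_threshold(s + self.conv_s(r), self.theta)
        return l, s

class UnrolledLRRNet(nn.Module):
    """A fixed number of unrolled iterations replaces the solver's loop."""
    def __init__(self, channels=16, num_iters=4):
        super().__init__()
        self.embed = nn.Conv2d(1, channels, 3, padding=1)
        self.blocks = nn.ModuleList(
            [UnrolledLRRBlock(channels) for _ in range(num_iters)]
        )

    def forward(self, img):
        x = self.embed(img)
        l = torch.zeros_like(x)
        s = torch.zeros_like(x)
        for block in self.blocks:
            l, s = block(x, l, s)
        return l, s  # base (low-rank-like) and detail (sparse-like) feature maps

# Usage: decompose a single-channel image into base and detail features.
l, s = UnrolledLRRNet()(torch.randn(1, 1, 64, 64))
```

In a fusion setting, base and detail features extracted from each source image by such a network would then be combined and decoded back into a single fused image; the details of that fusion and decoding stage are specific to the paper and are not reproduced here.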

Hui Li, Tianyang Xu, Xiao-Jun Wu, Jiwen Lu, Josef Kittler • 2023

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Semantic segmentation | FMB (test) | mIoU | 56.37 | 100 |
| Semantic segmentation | MSRS | mIoU | 73.4 | 68 |
| Infrared-Visible Image Fusion | RoadScene (test) | Visual Information Fidelity (VIF) | 0.58 | 53 |
| Semantic segmentation | FMB | mIoU | 0.6294 | 49 |
| Visible-Infrared Image Fusion | MSRS (test) | Average Gradient (AG) | 2.651 | 43 |
| Infrared and Visible Image Fusion | RoadScene | Qabf | 0.39 | 42 |
| Infrared-Visible Image Fusion | MSRS | QAB/F | 0.447 | 38 |
| Infrared-Visible Image Fusion | LLVIP (test) | EN (entropy) | 6.67 | 36 |
| Object Detection | M3FD | AP@[0.5:0.95] | 49.6 | 35 |
| Object Detection | MSRS (test) | mAP@0.5 | 98.3 | 34 |
Showing 10 of 47 rows
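
For reference, two of the no-reference metrics in the table above, entropy (EN) and average gradient (AG), can be computed as in the minimal sketch below. It assumes the standard definitions used in the fusion literature: a 256-bin grey-level histogram for EN, and the root-mean-square of horizontal and vertical intensity differences for AG.

```python
import numpy as np

def entropy(img, bins=256):
    # EN = -sum_i p_i * log2(p_i) over the grey-level histogram.
    hist, _ = np.histogram(img, bins=bins, range=(0, 255))
    p = hist / hist.sum()
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

def average_gradient(img):
    # AG = mean over pixels of sqrt((dx^2 + dy^2) / 2),
    # using finite differences cropped to a common (M-1, N-1) grid.
    img = img.astype(np.float64)
    gx = np.diff(img, axis=1)[:-1, :]   # horizontal differences
    gy = np.diff(img, axis=0)[:, :-1]   # vertical differences
    return float(np.mean(np.sqrt((gx ** 2 + gy ** 2) / 2)))
```

Higher EN indicates a fused image carrying more information, and higher AG indicates sharper gradients, which is why both appear as "higher is better" fusion benchmarks.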
