Deep Fourier-embedded Network for RGB and Thermal Salient Object Detection

About

The rapid development of deep learning has significantly improved salient object detection (SOD) that combines RGB and thermal (RGB-T) images. However, existing Transformer-based RGB-T SOD models with quadratic complexity are memory-intensive, limiting their application in high-resolution bimodal feature fusion. To overcome this limitation, we propose a purely Fourier Transform-based model, namely the Deep Fourier-embedded Network (FreqSal), for accurate RGB-T SOD. Specifically, we leverage the efficiency of the Fast Fourier Transform with linear complexity to design three key components: (1) To fuse RGB and thermal modalities, we propose Modal-coordinated Perception Attention, which aligns and enhances bimodal Fourier representations in multiple dimensions; (2) To clarify object edges and suppress noise, we design the Frequency-decomposed Edge-aware Block, which deeply decomposes and filters Fourier components of low-level features; (3) To accurately decode features, we propose the Fourier Residual Channel Attention Block, which prioritizes high-frequency information while aligning channel-wise global relationships. Additionally, even when converged, the predictions of existing deep learning-based SOD models still exhibit frequency gaps relative to the ground truth. To address this problem, we propose the Co-focus Frequency Loss, which dynamically weights hard frequencies during edge frequency reconstruction by cross-referencing bimodal edge information in the Fourier domain. Extensive experiments on ten bimodal SOD benchmark datasets demonstrate that FreqSal outperforms twenty-nine existing state-of-the-art bimodal SOD models. Comprehensive ablation studies further validate the value and effectiveness of the newly proposed components. The code is available at https://github.com/JoshuaLPF/FreqSal.

Pengfei Lyu, Xiaosheng Yu, Pak-Hei Yeung, Chengdong Wu, Jagath C. Rajapakse • 2024
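
The idea of weighting "hard" frequencies more heavily in a loss can be illustrated with a small, dependency-free sketch. This is not the authors' Co-focus Frequency Loss: it uses no bimodal edge information, and the function names `dft2` and `frequency_weighted_loss`, the `alpha` parameter, and the weighting rule are all assumptions made for illustration. It only shows the general pattern of comparing a prediction and a ground-truth map in the Fourier domain and scaling each frequency's error by how large that error is. A naive DFT is used to avoid external libraries; in practice an FFT routine would normally replace it.

```python
import cmath


def dft2(img):
    """Naive 2D discrete Fourier transform of a list-of-lists image.

    Hypothetical helper for illustration only; a real pipeline would
    call an FFT library instead of this quadratic-time transform.
    """
    h, w = len(img), len(img[0])
    # Transform along each row first.
    rows = [
        [sum(img[y][x] * cmath.exp(-2j * cmath.pi * v * x / w) for x in range(w))
         for v in range(w)]
        for y in range(h)
    ]
    # Then transform along each column.
    return [
        [sum(rows[y][v] * cmath.exp(-2j * cmath.pi * u * y / h) for y in range(h))
         for v in range(w)]
        for u in range(h)
    ]


def frequency_weighted_loss(pred, target, alpha=1.0):
    """Spectral error in which frequencies with larger error get larger weight.

    Assumed weighting rule (not taken from the paper): each weight is the
    frequency's error magnitude, normalised to [0, 1], raised to `alpha`.
    """
    fp, ft = dft2(pred), dft2(target)
    h, w = len(pred), len(pred[0])
    # Squared distance between the two spectra at every frequency.
    dist = [[abs(fp[u][v] - ft[u][v]) ** 2 for v in range(w)] for u in range(h)]
    max_d = max(max(row) for row in dist)
    if max_d == 0.0:
        return 0.0  # identical spectra
    total = 0.0
    for row in dist:
        for d in row:
            weight = (d ** 0.5 / max_d ** 0.5) ** alpha  # "hard" frequencies approach 1
            total += weight * d
    return total / (h * w)


# Toy 4x4 saliency maps: a ground-truth square, a near miss, and an empty map.
target = [[0, 0, 0, 0], [0, 1, 1, 0], [0, 1, 1, 0], [0, 0, 0, 0]]
close = [[0, 0, 0, 0], [0, 0.5, 1, 0], [0, 1, 1, 0], [0, 0, 0, 0]]
empty = [[0, 0, 0, 0] for _ in range(4)]

loss_same = frequency_weighted_loss(target, target)
loss_close = frequency_weighted_loss(close, target)
loss_empty = frequency_weighted_loss(empty, target)
```

In this toy setup the loss is zero for a perfect prediction and grows as the prediction drifts further from the ground truth; raising `alpha` concentrates the penalty on the frequencies with the largest error.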

Related benchmarks

Task                            Dataset        Metric         Result  Rank
RGB-T Salient Object Detection  VT1000         S-Measure (S)  94.8    42
RGB-T Salient Object Detection  VT821          S Score        0.917   42
RGB-T Salient Object Detection  VT1000 (test)  S-Measure      94.8    39
RGB-T Salient Object Detection  VT821 (test)   Sm             0.917   39
RGB-T Salient Object Detection  VT5000 (test)  Sm Score       92      39
RGB-T Salient Object Detection  VT5000         Score (M)      2.3     28
