MTLSI-Net: A Linear Semantic Interaction Network for Parameter-Efficient Multi-Task Dense Prediction

About

Multi-task dense prediction aims to perform multiple pixel-level tasks simultaneously. However, capturing global cross-task interactions remains non-trivial due to the quadratic complexity of standard self-attention on high-resolution features. To address this limitation, we propose a Multi-Task Linear Semantic Interaction Network (MTLSI-Net), which facilitates cross-task interaction through linear attention. Specifically, MTLSI-Net incorporates three key components: a Multi-Task Multi-scale Query Linear Fusion Block, which captures cross-task dependencies across multiple scales with linear complexity using a shared global context matrix; a Semantic Token Distiller that compresses redundant features into compact semantic tokens, distilling essential cross-task knowledge; and a Cross-Window Integrated attention Block that injects global semantics into local features via a dual-branch architecture, preserving both global consistency and spatial precision. These components collectively enable the network to capture comprehensive cross-task interactions at linear complexity with reduced parameters. Extensive experiments on NYUDv2 and PASCAL-Context demonstrate that MTLSI-Net achieves state-of-the-art performance, validating its effectiveness and efficiency in multi-task learning.
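
The complexity claim in the abstract can be illustrated with a minimal sketch of generic kernelized linear attention. This is not the authors' implementation: the feature map (`elu(x) + 1`), the function names, and the way task-specific queries reuse one context matrix are all assumptions made for illustration. The idea is that keys and values are summarized once into a small `d x d_v` context matrix, so each task's queries cost O(N) in the number of pixels N instead of the O(N^2) of standard self-attention.

```python
import numpy as np


def feature_map(x):
    # Positive feature map elu(x) + 1 (an assumed choice, common in linear attention).
    return np.where(x > 0, x + 1.0, np.exp(np.minimum(x, 0.0)))


def global_context(k, v):
    # Summarize all N key/value pairs once: context is (d, d_v), norm is (d,).
    phi_k = feature_map(k)
    return phi_k.T @ v, phi_k.sum(axis=0)


def linear_attention(q, context, norm, eps=1e-6):
    # O(N * d * d_v): each query row reads the shared context, never the N x N map.
    phi_q = feature_map(q)
    return (phi_q @ context) / (phi_q @ norm + eps)[:, None]


def quadratic_reference(q, k, v, eps=1e-6):
    # Same attention computed through the explicit N x N weight matrix, for checking.
    weights = feature_map(q) @ feature_map(k).T
    return (weights @ v) / (weights.sum(axis=1) + eps)[:, None]


def multi_task_attention(task_queries, k, v):
    # Hypothetical multi-task use: one context matrix shared by every task's queries.
    context, norm = global_context(k, v)
    return {name: linear_attention(q, context, norm) for name, q in task_queries.items()}


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, d = 64, 8  # 64 flattened pixels, 8 channels
    k, v = rng.normal(size=(n, d)), rng.normal(size=(n, d))
    queries = {"depth": rng.normal(size=(n, d)), "semseg": rng.normal(size=(n, d))}
    outputs = multi_task_attention(queries, k, v)
    print({name: out.shape for name, out in outputs.items()})
```

Because matrix multiplication is associative, `linear_attention` and `quadratic_reference` return the same values up to floating-point error; only the order of the products, and therefore the cost, differs.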

Chen Liu, Hengyu Man, Xiaopeng Fan, Debin Zhao • 2026

Related benchmarks

Task                        Dataset          Result                    Rank
Depth Estimation            NYU V2           RMSE 0.4904               167
Semantic segmentation       NYUD v2          mIoU 57.22                150
Surface Normal Estimation   Pascal Context   Mean Error (MAE) 13.71    45
Saliency Detection          Pascal Context   maxF Score 84.52          45
Semantic segmentation       Pascal Context   mIoU 80.86                42
Human Parsing               Pascal Context   mIoU 69.9                 35
Boundary Detection          NYUD v2          ODS F-measure 78.6        30