$\mathcal{B}^{3}$-Net: Controlled Posterior Bridge Learning for Multi-Task Dense Prediction

About

Multi-task dense prediction solves complementary pixel-level tasks, such as semantic segmentation, depth estimation, surface normal estimation, and edge detection, in a unified model. Existing decoder-side interactions use attention, prompts, routing, diffusion, Mamba, or bridge features to exchange task evidence, but most of them organize this evidence implicitly. They usually fuse task features by similarity or affinity, without explicitly modeling how evidence reliability varies across tasks and spatial locations. As a result, unreliable evidence may contaminate the shared representation and intensify negative transfer. We propose $\mathcal{B}^{3}$-Net, a controlled posterior bridge learning framework for multi-task dense prediction. Our method decomposes decoder-side interaction into reliability estimation, posterior bridge construction, and bounded redistribution. The Precision Field Estimator estimates patch-wise evidence precision from task-reference alignment and local variation. The Posterior Bridge Operator builds a precision-weighted posterior bridge through heteroscedastic evidence fusion, yielding a shared state more reliable than uniform or heuristic mixtures. The Contractive Dispatch Operator redistributes the bridge to each task branch through a bounded update, reducing uncontrolled feature injection. Experiments on NYUD-v2, PASCAL-Context, and Cityscapes show that $\mathcal{B}^{3}$-Net achieves competitive or superior trade-offs over representative CNN-, Transformer-, diffusion-, Mamba-, and bridge-feature-based methods. Backbone-matched comparisons and extensive analyses further verify that the gains arise from controlled posterior bridge learning rather than backbone capacity or decoder scale.
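The abstract gives no equations, so the following is only a minimal sketch of what precision-weighted fusion followed by a bounded update could look like. It assumes the textbook inverse-variance (Gaussian product) mean as a stand-in for the posterior bridge and a convex interpolation step as a stand-in for the contractive dispatch. The function names, tensor shapes, and the `step` parameter are hypothetical and not taken from the paper, and the precisions are treated as given inputs rather than estimated.

```python
import numpy as np


def precision_weighted_bridge(features, precisions, eps=1e-8):
    """Fuse per-task features into one shared state (illustrative only).

    features:   array of shape (T, N, C), T tasks, N patches, C channels.
    precisions: array of shape (T, N), positive reliability per task and patch.
    Returns an array of shape (N, C): the precision-weighted mean over tasks.
    """
    features = np.asarray(features, dtype=float)
    precisions = np.asarray(precisions, dtype=float)
    # Normalize precisions over the task axis so the weights sum to ~1 per patch.
    weights = precisions / (precisions.sum(axis=0, keepdims=True) + eps)
    # Broadcast the weights over channels and sum over the task axis.
    return (weights[..., None] * features).sum(axis=0)


def bounded_dispatch(task_feature, bridge, step=0.5):
    """Move a task feature toward the bridge by a bounded convex step.

    With 0 <= step <= 1 the result lies between the original feature and the
    bridge, so the distance to the bridge shrinks by a factor of (1 - step).
    """
    if not 0.0 <= step <= 1.0:
        raise ValueError("step must lie in [0, 1]")
    task_feature = np.asarray(task_feature, dtype=float)
    return task_feature + step * (bridge - task_feature)


if __name__ == "__main__":
    # Two tasks, one patch, one channel: the second task is three times
    # as reliable, so the bridge sits closer to its feature value.
    feats = np.array([[[0.0]], [[10.0]]])   # shape (2, 1, 1)
    precs = np.array([[1.0], [3.0]])        # shape (2, 1)
    bridge = precision_weighted_bridge(feats, precs)
    updated = bounded_dispatch(feats[0], bridge, step=0.5)
    print("bridge:", bridge, "updated task-0 feature:", updated)
```

In this sketch a low-precision task contributes little to the shared state, and the `step` bound keeps the redistributed bridge from overwriting a task branch outright. Both properties mirror the motivation stated in the abstract, though the paper's actual operators may differ.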

Meihua Zhou, Li Yang • 2026

Related benchmarks

Task	Dataset	Result
Semantic segmentation	Cityscapes	mIoU 93.95	526
Depth Estimation	NYU V2	RMSE 0.4587	207
Semantic segmentation	NYUD v2	mIoU 57.78	169
Saliency Detection	Pascal Context	maxF Score 86.11	64
Surface Normal Estimation	Pascal Context	Mean Error (MAE) 13.4	64
Semantic segmentation	Pascal Context	mIoU 80.81	61
Human Parsing	Pascal Context	mIoU 73.73	54
Surface Normal Estimation	NYUD	mErr 17.22	38
Edge Detection	NYUD v2	ODS 83.18	33
Edge Detection	Pascal Context	ODS F-score 81.18	17

Showing 10 of 11 rows
