A Multi-scale Linear-time Encoder for Whole-Slide Image Analysis

About

We introduce Multi-scale Adaptive Recurrent Biomedical Linear-time Encoder (MARBLE), the first purely Mamba-based multi-state multiple instance learning (MIL) framework for whole-slide image (WSI) analysis. MARBLE processes multiple magnification levels in parallel and integrates coarse-to-fine reasoning within a linear-time state-space model, efficiently capturing cross-scale dependencies with minimal parameter overhead. WSI analysis remains challenging due to gigapixel resolutions and hierarchical magnifications, while existing MIL methods typically operate at a single scale and transformer-based approaches suffer from quadratic attention costs. By coupling parallel multi-scale processing with linear-time sequence modeling, MARBLE provides a scalable and modular alternative to attention-based architectures. Experiments on five public datasets show improvements of up to 6.9% in AUC, 20.3% in accuracy, and 2.3% in C-index, establishing MARBLE as an efficient and generalizable framework for multi-scale WSI analysis.
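
To make the "linear-time" claim concrete, here is a minimal sketch of a state-space recurrence applied independently to patch sequences from several magnification levels. It is not MARBLE's architecture: Mamba layers use input-dependent (selective) parameters and vector-valued states, whereas this toy uses fixed scalar parameters. The names `ssm_scan` and `encode_scales`, the parameters `a`, `b`, `c`, and the mean-pooling step are all hypothetical choices made for illustration.

```python
from typing import List, Sequence, Tuple


def ssm_scan(xs: Sequence[float], a: float, b: float, c: float,
             h0: float = 0.0) -> Tuple[List[float], float]:
    """Run h_t = a*h_{t-1} + b*x_t and y_t = c*h_t in a single pass.

    The loop touches each element once, so the cost grows linearly with
    the sequence length (self-attention compares every pair instead).
    """
    h = h0
    ys = []
    for x in xs:
        h = a * h + b * x
        ys.append(c * h)
    return ys, h


def encode_scales(scales: Sequence[Sequence[float]],
                  a: float = 0.5, b: float = 1.0, c: float = 1.0) -> List[float]:
    """Scan each magnification level on its own and mean-pool the outputs.

    Each scale is independent of the others here, so the scans could run
    in parallel. Mean pooling is an assumed, simplistic bag-level summary.
    """
    pooled = []
    for xs in scales:
        ys, _ = ssm_scan(xs, a, b, c)
        pooled.append(sum(ys) / len(ys))
    return pooled


# Toy scalar "patch features": 2 patches at a coarse level, 4 at a finer one.
summary = encode_scales([[1.0, 1.0], [2.0, 2.0, 2.0, 2.0]])
```

For a bag of L patches, the scan performs L state updates, while pairwise attention over the same bag would form on the order of L² interactions. That gap is what makes long patch sequences at high magnification tractable.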

Jagan Mohan Reddy Dwarampudi, Joshua Wong, Hien Van Nguyen, Tania Banerjee • 2026

Related benchmarks

Task	Dataset	Result
Slide-level classification	TCGA NSCLC (test)	Accuracy 89.66	96
Survival Analysis	TCGA-LUAD (test)	C-index 0.6432	40
Survival Prediction	TCGA-STAD (test)	C-index 0.651	24
Classification	PANDA (test)	Accuracy 71	19
Survival Analysis	TCGA KIRP (test)	C-index 0.8184	18
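
For readers unfamiliar with the C-index reported in the survival rows, the following is a small sketch of Harrell's concordance index. The page does not say which C-index variant or tie-handling rule was used, so this definition is an assumption, and the function name `concordance_index` and the toy data are illustrative only.

```python
from typing import Sequence


def concordance_index(times: Sequence[float], events: Sequence[int],
                      risks: Sequence[float]) -> float:
    """Fraction of comparable patient pairs ranked correctly by risk.

    A pair (i, j) is comparable when patient i has an observed event
    (events[i] == 1) strictly before patient j's recorded time. The pair is
    concordant when the model gives i the higher risk. Ties in risk count
    as half. Pairs with equal times are skipped in this simplified version.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy cohort: the third patient is censored (event flag 0).
toy_c_index = concordance_index(
    times=[2, 4, 6, 8],
    events=[1, 1, 0, 1],
    risks=[0.9, 0.7, 0.8, 0.1],
)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking. In the toy cohort, four of the five comparable pairs are ordered correctly.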

