Fine-grained Context and Multi-modal Alignment for Freehand 3D Ultrasound Reconstruction

About

Fine-grained spatio-temporal learning is crucial for freehand 3D ultrasound reconstruction. Previous works mainly relied on coarse-grained spatial features and separate temporal dependency learning, and struggled with fine-grained spatio-temporal learning. Mining spatio-temporal information at fine-grained scales is extremely challenging because long-range dependencies are difficult to learn. In this context, we propose a novel method that exploits the long-range dependency modeling capabilities of the state space model (SSM) to address this challenge. Our contribution is three-fold. First, we propose ReMamba, which mines multi-scale spatio-temporal information by devising a multi-directional SSM. Second, we propose an adaptive fusion strategy that introduces multiple inertial measurement units as auxiliary temporal information to enhance spatio-temporal perception. Last, we design an online alignment strategy that encodes the temporal information as pseudo labels for multi-modal alignment to further improve reconstruction performance. Extensive experimental validation on two large-scale datasets shows that our method remarkably outperforms competing methods.
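To make the idea of scanning a sequence with an SSM in more than one direction concrete, here is a minimal sketch. It is not ReMamba and is not taken from the paper: it assumes a scalar linear recurrence h_t = a*h_(t-1) + b*x_t with output y_t = c*h_t, fixed (not learned or input-dependent) parameters, and a simple sum of a forward and a backward scan. The names `ssm_scan` and `bidirectional_ssm` are hypothetical.

```python
def ssm_scan(xs, a, b, c):
    """Scan a 1-D sequence with the linear recurrence
    h_t = a * h_{t-1} + b * x_t,  y_t = c * h_t  (h starts at 0)."""
    h = 0.0
    ys = []
    for x in xs:
        h = a * h + b * x
        ys.append(c * h)
    return ys


def bidirectional_ssm(xs, a, b, c):
    """Sum a forward scan and a backward scan so each output position
    sees context from both earlier and later inputs (an assumed way to
    combine directions, chosen only for illustration)."""
    fwd = ssm_scan(xs, a, b, c)
    # Scan the reversed sequence, then flip back to the original order.
    bwd = ssm_scan(xs[::-1], a, b, c)[::-1]
    return [f + g for f, g in zip(fwd, bwd)]


# An impulse at the first position: the forward scan decays it over time,
# while the backward scan only reaches it at the very end of its pass.
signal = [1.0, 0.0, 0.0, 0.0]
out = bidirectional_ssm(signal, a=0.5, b=1.0, c=1.0)
print(out)  # [2.0, 0.5, 0.25, 0.125]
```

The decay factor `a` controls how far information propagates along the sequence, which is the property that makes this family of models attractive for long-range dependencies.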

Zhongnuo Yan, Xin Yang, Mingyuan Luo, Jiongquan Chen, Rusi Chen, Lian Liu, Dong Ni • 2024

Related benchmarks

Task	Dataset	Metric	Result	Rank
Freehand 3D Ultrasound Reconstruction	Arm dataset (test)	FDR (%)	9.72	11
Freehand 3D Ultrasound Reconstruction	Carotid dataset (test)	FDR (%)	8.61	11

