
PRIMUS: Pretraining IMU Encoders with Multimodal Self-Supervision

About

Sensing human motion through Inertial Measurement Units (IMUs) embedded in personal devices has enabled significant applications in health and wellness. Labeled IMU data is scarce; however, unlabeled or weakly labeled IMU data can still be used to model human motion. For the video and text modalities, the "pretrain and adapt" approach uses large volumes of unlabeled or weakly labeled data to build a strong feature extractor, which is then adapted to specific tasks with limited labeled data. For IMU data, however, pretraining methods are poorly understood, and pipelines are rarely evaluated on out-of-domain tasks. We propose PRIMUS, a method for PRetraining IMU encoderS with a novel pretraining objective that is empirically validated by downstream performance on both in-domain and out-of-domain datasets. The PRIMUS objective effectively enhances downstream performance by combining self-supervision, multimodal supervision, and nearest-neighbor supervision. With fewer than 500 labeled samples per class, PRIMUS improves test accuracy by up to 15% compared to state-of-the-art baselines. To benefit the broader community, we have open-sourced our code at github.com/nokia-bell-labs/pretrained-imu-encoders.
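At a high level, an objective combining self-supervision, multimodal supervision, and nearest-neighbor supervision can be sketched as a weighted sum of contrastive terms. The sketch below is an illustrative assumption, not the paper's actual implementation (see the open-sourced repository for that); the function names, the support bank, and the equal loss weights are all hypothetical.

```python
import numpy as np

def l2_normalize(x, axis=-1):
    """Normalize rows to unit length so dot products become cosine similarities."""
    return x / np.linalg.norm(x, axis=axis, keepdims=True)

def info_nce(anchors, positives, temperature=0.1):
    """InfoNCE loss: row i of `anchors` should match row i of `positives`."""
    a = l2_normalize(anchors)
    p = l2_normalize(positives)
    logits = a @ p.T / temperature                   # (B, B) similarity matrix
    logits -= logits.max(axis=1, keepdims=True)      # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))              # diagonal entries are the positives

def primus_style_loss(imu_v1, imu_v2, video_emb, support_bank,
                      w_ssl=1.0, w_mm=1.0, w_nn=1.0):
    """Hypothetical combined objective with three contrastive terms.

    imu_v1, imu_v2: embeddings of two augmented views of the same IMU windows.
    video_emb:      embeddings of the paired video (or text) clips.
    support_bank:   a bank of embeddings used for nearest-neighbor positives.
    """
    # Self-supervision: two augmented views of the same IMU window should agree.
    l_ssl = info_nce(imu_v1, imu_v2)
    # Multimodal supervision: an IMU embedding should match its paired video embedding.
    l_mm = info_nce(imu_v1, video_emb)
    # Nearest-neighbor supervision: the positive is the closest entry in the bank.
    sims = l2_normalize(imu_v1) @ l2_normalize(support_bank).T
    nn_idx = sims.argmax(axis=1)
    l_nn = info_nce(imu_v1, support_bank[nn_idx])
    return w_ssl * l_ssl + w_mm * l_mm + w_nn * l_nn
```

How the three terms are weighted, and how the nearest-neighbor positives are drawn, are design choices of the actual method; this sketch simply sums them with equal weights.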

Arnav M. Das, Chi Ian Tang, Fahim Kawsar, Mohammad Malekzadeh • 2024

Related benchmarks

Task                          Dataset                  Result               Rank
Human Activity Recognition    TotalCapture             Accuracy: 67         16
Human Activity Recognition    MRI                      Accuracy: 82         16
Action Recognition            Opportunity++ 10 (test)  F1 (Weighted): 0.28  5
Action Recognition            HWU-USP 39 (test)        F1 (Weighted): 48    5
