SensorLLM: Aligning Large Language Models with Motion Sensors for Human Activity Recognition

About

We introduce SensorLLM, a two-stage framework that enables Large Language Models (LLMs) to perform human activity recognition (HAR) from sensor time-series data. Despite their strong reasoning and generalization capabilities, LLMs remain underutilized for motion sensor data because time-series data lack semantic context, computational resources are constrained, and LLMs struggle to process numerical inputs. SensorLLM addresses these limitations through a Sensor-Language Alignment stage, in which the model aligns sensor inputs with trend descriptions and special tokens are introduced to mark channel boundaries. This alignment enables LLMs to capture numerical variations, channel-specific features, and data of varying durations, without requiring human annotations. In the subsequent Task-Aware Tuning stage, we refine the model for HAR classification, achieving performance that matches or surpasses state-of-the-art methods. Our results demonstrate that SensorLLM evolves into an effective sensor learner, reasoner, and classifier through human-intuitive Sensor-Language Alignment, generalizing across diverse HAR datasets. We believe this work establishes a foundation for future research on time-series and text alignment, paving the way for foundation models in sensor data analysis. Our code is available at https://github.com/zechenli03/SensorLLM.
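To make the alignment idea concrete, the following is a minimal sketch of how trend descriptions could be generated automatically from a single sensor channel, and how a channel could be wrapped in boundary tokens. It is an illustration only, not the authors' implementation: the function names (`describe_trends`, `trend_text`, `wrap_channel`), the tolerance `tol`, the token spellings such as `<acc_x_start>`, and the `<sensor>` placeholder are all assumptions made for this example.

```python
from typing import List, Tuple


def describe_trends(values: List[float], tol: float = 1e-3) -> List[Tuple[int, int, str]]:
    """Split a series into maximal runs labelled increasing, decreasing, or steady.

    Each run is (start_index, end_index, label). The threshold `tol` is a
    hypothetical parameter: changes smaller than it count as "steady".
    """
    def label(delta: float) -> str:
        if delta > tol:
            return "increasing"
        if delta < -tol:
            return "decreasing"
        return "steady"

    runs: List[Tuple[int, int, str]] = []
    for i in range(1, len(values)):
        current = label(values[i] - values[i - 1])
        if runs and runs[-1][2] == current:
            # Same trend as the previous step: extend the current run.
            start, _, _ = runs[-1]
            runs[-1] = (start, i, current)
        else:
            # Trend changed: open a new run starting at the previous sample.
            runs.append((i - 1, i, current))
    return runs


def trend_text(channel: str, values: List[float], tol: float = 1e-3) -> str:
    """Render the runs as a plain-language description, with no human annotation."""
    parts = [
        f"{lab} from step {start} to step {end}"
        for start, end, lab in describe_trends(values, tol)
    ]
    return f"{channel}: " + "; ".join(parts) + "."


def wrap_channel(name: str, placeholder: str = "<sensor>") -> str:
    """Mark a channel's boundaries with hypothetical start/end tokens."""
    return f"<{name}_start>{placeholder}<{name}_end>"


if __name__ == "__main__":
    print(trend_text("x-axis accelerometer", [0.0, 1.0, 2.0, 2.0, 1.0]))
    print(wrap_channel("acc_x"))
```

Because the descriptions are derived directly from the numbers, text of this kind can be paired with raw sensor windows of any length, which is the sense in which the alignment stage needs no human annotations.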

Zechen Li, Shohreh Deldari, Linyao Chen, Hao Xue, Flora D. Salim • 2024

Related benchmarks

Task	Dataset	Result
Human Activity Recognition	UCI-HAR	Accuracy 90.8	86
Human Activity Recognition	PAMAP2	Accuracy 87.2	66
Activity Recognition	mHealth	F1 Score 89.4	46
Human Activity Recognition	USC-HAD	Macro F1 61.2	33
Action Captioning	XRF IMU v2 (test)	BLEU@4 72.3	16
Action Captioning	UWash (test)	B@4 0.828	16
Action Captioning	XRF Wi-Fi v2 (test)	BLEU@4 0.392	15
Action Captioning	WiFiTAD (test)	B@4 44.2	15

