MambaGaze: Bidirectional Mamba with Explicit Missing Data Modeling for Cognitive Load Assessment from Eye-Gaze Tracking Data

About

Real-time cognitive load assessment from eye-tracking signals could enable adaptive human-centered AI in safety-critical applications such as driver vigilance monitoring or automated flight deck assistance, yet two challenges persist: handling frequent data missingness from blinks and tracking failures, and efficiently modeling long-range temporal dependencies. We propose MambaGaze (Bi-Mamba), a framework that addresses these challenges through (1) XMD encoding, which augments raw features with observation masks and time-deltas to explicitly model data uncertainty, and (2) bidirectional Mamba-2, which captures temporal dependencies with linear computational complexity. Experiments on CLARE and CL-Drive datasets under leave-one-subject-out evaluation show that MambaGaze achieves 77.1% accuracy and 59.2% macro-F1 on CLARE, and 69.4% accuracy and 51.5% macro-F1 on CL-Drive, attaining the highest average LOSO macro-F1 (55.3%) across all ten compared models. Input-stream ablation indicates that log-scaled time-deltas are the strongest single channel in our setting, and combining all three XMD streams provides consistent gains of 5–20 pp macro-F1. Edge deployment benchmarks on three NVIDIA Jetson Orin platforms show real-time inference at 27–36 FPS with power consumption below 6.6 W, supporting feasibility for embedded cognitive load monitoring.
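
To make the XMD idea concrete, the following is a minimal sketch of how a single gaze channel might be expanded into the three streams the abstract names: the value, an observation mask, and a log-scaled time-delta. It is not the authors' implementation. The function name xmd_encode, the zero fill value, the use of log1p for scaling, and the definition of the time-delta as time elapsed since the last observed sample (or since the start of the window if nothing has been observed yet) are all assumptions made for illustration.

```python
import math
from typing import List, Optional, Tuple


def xmd_encode(
    values: List[Optional[float]],
    timestamps: List[float],
    fill: float = 0.0,  # assumption: missing samples are filled with a constant
) -> List[Tuple[float, float, float]]:
    """Return one (value, mask, log_delta) triple per time step.

    Hypothetical helper: a sample counts as missing if it is None or NaN,
    for example during a blink or a tracking failure.
    """
    if len(values) != len(timestamps):
        raise ValueError("values and timestamps must have the same length")
    if not values:
        return []

    encoded = []
    # Reference time for the delta; starts at the first timestamp so that
    # leading gaps accumulate from the start of the window (assumption).
    last_time = timestamps[0]
    for v, t in zip(values, timestamps):
        observed = v is not None and not math.isnan(v)
        delta = t - last_time               # time since the last observation
        encoded.append((
            v if observed else fill,        # stream 1: value (filled if missing)
            1.0 if observed else 0.0,       # stream 2: observation mask
            math.log1p(delta),              # stream 3: log-scaled time-delta
        ))
        if observed:
            last_time = t
    return encoded


# Example: two samples lost to a blink in a four-sample window.
example = xmd_encode([0.5, None, None, 0.7], [0.0, 0.1, 0.2, 0.3])
```

In this sketch the time-delta keeps growing across a run of missing samples and resets only after a valid sample, so a downstream sequence model can tell a short dropout from a long one even though the filled values look identical.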

Amir Mousavi, Mohammad Sadegh Sirjani, Erfan Nourbakhsh, Mimi Xie, Rocky Slavin, Leslie Neely, John Davis, John Quarles • 2026

Related benchmarks

Task	Dataset	Result
Cognitive Load Classification	CLARE (LOSO)	Accuracy 76.8	8
Cognitive Load Classification	CL-Drive (LOSO)	Accuracy (LOSO) 73.1	8
Cognitive Load Classification	CLARE (K-Fold)	Accuracy 73.3	8
Cognitive Load Classification	CL-Drive (K-Fold)	Accuracy 68.8	8
