Standing on the Shoulders of Giants: Rethinking EEG Foundation Model Pretraining via Multi-Teacher Distillation

About

Pretraining for electroencephalogram (EEG) foundation models has predominantly relied on self-supervised masked reconstruction, a paradigm adapted from and inspired by the success of vision and language foundation models. However, unlike images and text, EEG datasets are notoriously expensive to collect and have a low signal-to-noise ratio. These challenges make it difficult to scale EEG foundation models and to capture the underlying neural semantics through reconstruction. In this work, we ask the question: can we stand on the shoulders of well-established foundation models from well-represented modalities to bootstrap the pretraining of EEG foundation models? We first demonstrate that mainstream foundation models, such as those from vision and time series, transfer surprisingly well to the EEG domain. Building on this finding, we propose the Multi-Teacher Distillation Pretraining (MTDP) framework for pretraining EEG foundation models via a two-stage multi-teacher distillation. In the first stage, we introduce a learnable gating network to fuse representations from diverse teachers (e.g., DINOv3 and Chronos) via a masked latent denoising objective. In the second stage, we distill the fused representation into an EEG foundation model. Extensive evaluations across 9 downstream tasks and 12 datasets demonstrate that our MTDP-based EEG foundation model outperforms its self-supervised counterparts while requiring only 25% of the pretraining data.
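
The two stages described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the function names (`fuse_teachers`, `distillation_loss`), the fixed gate logits standing in for a learned gating network, and the mean-squared-error objective are all assumptions made for illustration, and the masked latent denoising objective is not modeled here.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_teachers(teacher_embeddings, gate_logits):
    """Stage 1 (toy): fuse same-sized teacher embeddings by a gated weighted sum.

    gate_logits are fixed numbers here; in a real system they would come from
    a trainable gating network (assumption, not taken from the paper).
    """
    weights = softmax(gate_logits)
    dim = len(teacher_embeddings[0])
    fused = [0.0] * dim
    for w, emb in zip(weights, teacher_embeddings):
        for i, v in enumerate(emb):
            fused[i] += w * v
    return fused, weights


def distillation_loss(student_embedding, fused_target):
    """Stage 2 (toy): mean squared error between student output and fused target.

    MSE is an assumed choice of distillation objective.
    """
    n = len(fused_target)
    return sum((s - t) ** 2 for s, t in zip(student_embedding, fused_target)) / n


# Two hypothetical teacher embeddings for the same EEG segment.
vision_teacher = [1.0, 0.0]
timeseries_teacher = [0.0, 1.0]
fused, weights = fuse_teachers([vision_teacher, timeseries_teacher], [0.0, 0.0])
loss = distillation_loss([0.4, 0.6], fused)
```

With equal gate logits the two teachers are weighted equally; raising one logit shifts the fused target toward that teacher, which is the behavior a learned gate would tune per input.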

Chenqi Li, Yu Liu, Shuo Zhang, Timothy Denison, Tingting Zhu • 2026

Related benchmarks

Task	Dataset	Metric	Result
Binary classification of normal versus abnormal EEG signals	TUAB	Balanced Accuracy	81.02	113
EEG Classification	CHB-MIT	B-ACC	80.13	30
Motor Imagery Classification	PhysioNet-MI	Balanced Accuracy	64.57	27
Motor Imagery Classification	SHU-MI	Balanced Accuracy	63.78	22
EEG Classification	BCIC 3 2020	Balanced Accuracy	62.53	20
EEG Classification	MentalArithmetic	Balanced Accuracy	77.43	18
Sleep Staging	ISRUC (test)	Accuracy	79.41	14
EEG Classification	FACED	Binary Accuracy	56.95	13
EEG Classification	Mumtaz 2016	Balanced Accuracy	95.85	13
Motor Imagery Classification	BCIC 2a IV	Balanced Accuracy	59.81	13

Showing 10 of 23 rows
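
Most rows above report balanced accuracy. As a minimal sketch, assuming the standard definition (the unweighted mean of per-class recall), the metric can be computed as follows. The function name `balanced_accuracy` and the toy labels are illustrative and not taken from the paper.

```python
from collections import defaultdict


def balanced_accuracy(y_true, y_pred):
    """Unweighted mean of per-class recall over the classes present in y_true."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            correct[t] += 1
    recalls = [correct[c] / total[c] for c in total]
    return sum(recalls) / len(recalls)


# Imbalanced toy example: a model that always predicts class 0.
y_true = [0, 0, 0, 0, 1]
y_pred = [0, 0, 0, 0, 0]
score = balanced_accuracy(y_true, y_pred)  # 0.5, while plain accuracy is 0.8
```

Unlike plain accuracy, this metric does not reward a classifier for favoring the majority class, which matters for imbalanced EEG datasets.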
