LEAF: Language-EEG Aligned Foundation Model for Brain-Computer Interfaces
About
Recent advances in electroencephalography (EEG) foundation models, which capture transferable EEG representations, have greatly accelerated the development of brain-computer interfaces (BCIs). However, existing approaches still struggle to incorporate language instructions as prior constraints for EEG representation learning, limiting their ability to leverage the semantic knowledge inherent in language to unify different labels and tasks. To address this challenge, we present LEAF, a foundation model for EEG-Language Alignment with Semantic Task Instruction and Querying. LEAF integrates task-aware semantic guidance to produce structured and linguistically aligned EEG embeddings, thereby enhancing decoding robustness and transferability. In the pretraining stage, we introduce a joint Spectral-Temporal Reconstruction (STR) framework that captures the coupled spectral rhythms and temporal dynamics of EEG signals. STR applies randomized spectral perturbation to enhance frequency robustness and uses two complementary temporal objectives to learn both contextual and sequential structure. In the EEG-Language alignment stage, we propose the Instruction-conditioned Q-Former (IQF). This query-based cross-attention transformer injects instruction embeddings into EEG tokens and achieves semantic alignment with textual label embeddings through learnable queries. We evaluate LEAF on 16 downstream datasets spanning motor imagery, emotion recognition, steady-state visual evoked potentials, covert speech, and healthcare tasks. LEAF achieves state-of-the-art performance on 12 of the 16 datasets and obtains the best average results across all five task categories. Importantly, our analyses reveal for the first time that explicit task instructions serve as semantic priors guiding EEG embeddings into coherent and linguistically grounded spaces. The code and pre-trained weights will be released.
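The randomized spectral perturbation used by STR can be illustrated with a minimal sketch: jitter the amplitude of a random subset of frequency bins of an EEG segment, so the model must learn representations that are robust to frequency-domain variation. The function name and parameters (`perturb_ratio`, `noise_scale`) are illustrative assumptions, not LEAF's actual API or hyperparameters.

```python
import numpy as np

def spectral_perturb(eeg, perturb_ratio=0.1, noise_scale=0.5, rng=None):
    """Perturb a random subset of frequency bins of a (channels, time) EEG array.

    A hypothetical sketch of randomized spectral perturbation, not LEAF's
    implementation: bin selection and jitter scale are assumed choices.
    """
    rng = np.random.default_rng() if rng is None else rng
    spec = np.fft.rfft(eeg, axis=-1)                 # per-channel spectrum
    n_bins = spec.shape[-1]
    k = max(1, int(perturb_ratio * n_bins))          # number of bins to perturb
    idx = rng.choice(n_bins, size=k, replace=False)  # random bins
    # multiplicative amplitude jitter on the chosen bins
    jitter = 1.0 + noise_scale * rng.standard_normal((spec.shape[0], k))
    spec[:, idx] *= jitter
    return np.fft.irfft(spec, n=eeg.shape[-1], axis=-1)

# toy 4-channel, 256-sample segment
x = np.random.default_rng(0).standard_normal((4, 256))
x_aug = spectral_perturb(x)
```

Operating in the rFFT domain keeps the augmentation phase-preserving on unperturbed bins, so temporal structure outside the jittered frequencies is left intact.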
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Motor Imagery Classification | SHU-MI | Balanced Accuracy | 61.34 | 22 |
| Covert Speech Recognition | BCIC-Speech | Balanced Accuracy | 54.53 | 9 |
| Emotion Recognition | FACED | Balanced Accuracy | 58.19 | 9 |
| Emotion Recognition | SEED IV | Balanced Accuracy | 46.30 | 9 |
| Emotion Recognition | SEED V | Balanced Accuracy | 41.26 | 9 |
| Emotion Recognition | SEED VII | Balanced Accuracy | 33.56 | 9 |
| Mental Workload Classification | Mental Workload | Balanced Accuracy | 64.93 | 9 |
| Motor Imagery | HighGamma | Balanced Accuracy | 79.82 | 9 |
| Motor Imagery | Cho 2017 | Balanced Accuracy | 79.08 | 9 |
| Motor Imagery | Shin A 2017 | Balanced Accuracy | 72.56 | 9 |