CAMEL-CLIP: Channel-aware Multimodal Electroencephalography-text Alignment for Generalizable Brain Foundation Models
About
Electroencephalography (EEG) foundation models have shown promise for learning generalizable representations, yet they remain sensitive to channel heterogeneity, such as changes in channel composition or ordering. We propose CAMEL-CLIP (Channel-aware Multimodal EEG-text Alignment with Contrastive Language-Image Pretraining), a contrastive EEG-text multimodal foundation model designed to be robust to heterogeneous channel configurations and broadly applicable to diverse downstream tasks. CAMEL-CLIP introduces three key components: (1) channel attribute-based positional encoding, which identifies channels through semantic information; (2) dynamic channel projection, which generates variable-length embeddings by independently projecting each channel without feature compression; and (3) dual-level contrastive learning, which jointly performs channel-level and sample-level contrastive learning to capture both channel-specific and global signal characteristics. Experimental results demonstrate that CAMEL-CLIP achieves state-of-the-art performance under linear probing and outperforms existing foundation models that rely on full fine-tuning.
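The abstract describes components (2) and (3) only at a high level. The sketch below is an illustrative NumPy toy, not the paper's implementation: all tensor shapes, the shared projection matrix, and the symmetric InfoNCE loss form are assumptions, and the paper's exact formulation may differ.

```python
import numpy as np

def project_channels(eeg, W):
    """Dynamic channel projection (sketch): each channel's time series is
    projected independently, so the output keeps one embedding per channel
    and its length scales with the channel count (no feature compression).
    eeg: (C, T) raw signal, W: (T, D) assumed shared projection."""
    return eeg @ W  # (C, D)

def info_nce(a, b, temperature=0.1):
    """Symmetric-pairing InfoNCE over matched rows of a and b (assumed loss form)."""
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    logits = (a @ b.T) / temperature
    logits -= logits.max(axis=1, keepdims=True)          # numerical stability
    probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
    return -np.log(np.diag(probs)).mean()                # diagonal = positive pairs

rng = np.random.default_rng(0)
B, C, T, D = 3, 4, 64, 16                     # hypothetical batch/channel/time/dim sizes
W = rng.normal(size=(T, D))
eeg_batch = rng.normal(size=(B, C, T))
txt_channel = rng.normal(size=(B, C, D))      # hypothetical channel-level text embeddings
txt_sample = rng.normal(size=(B, D))          # hypothetical sample-level text embeddings

chan_emb = np.stack([project_channels(x, W) for x in eeg_batch])      # (B, C, D)

# Dual-level contrastive learning (sketch):
# channel level - contrast each sample's channel embeddings with its channel texts;
loss_chan = np.mean([info_nce(chan_emb[i], txt_channel[i]) for i in range(B)])
# sample level - mean-pool channels, contrast across the batch.
loss_samp = info_nce(chan_emb.mean(axis=1), txt_sample)
loss = loss_chan + loss_samp
```

Because each channel keeps its own embedding row, a recording with a different channel set simply yields a different number of rows, which is what makes the scheme tolerant of heterogeneous channel configurations.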
Related benchmarks
| Task | Dataset | Metric | Score | Rank |
|---|---|---|---|---|
| Major Depressive Disorder Classification | Mumtaz 2016 | Balanced Accuracy | 94.39 | 8 |
| Retrieval | TUAB | Pathological Retrieval Score | 81.82 | 8 |
| Seizure Detection | CHB-MIT | Balanced Accuracy | 82.66 | 8 |
| Age Classification | TUAB | Accuracy | 73.48 | 2 |
| Age Classification | TUAB (test) | Accuracy | 67.51 | 2 |
| Gender Classification | TUAB (test) | Accuracy | 64.37 | 2 |
| Pathological Classification | TUAB | Accuracy | 83.56 | 2 |
| Pathological Classification | TUAB (test) | Accuracy | 56.4 | 2 |
| Text-based classification | TUAB | Pathological Score | 81.58 | 2 |