CAMEL-CLIP: Channel-aware Multimodal Electroencephalography-text Alignment for Generalizable Brain Foundation Models
About
Electroencephalography (EEG) foundation models have shown promise for learning generalizable representations, yet they remain sensitive to channel heterogeneity, such as changes in channel composition or ordering. We propose CAMEL-CLIP (Channel-aware Multimodal EEG-text Alignment with Contrastive Language-Image Pretraining), a contrastive EEG-text multimodal foundation model designed to be robust to heterogeneous channel configurations and broadly applicable to diverse downstream tasks. CAMEL-CLIP introduces three key components: (1) channel attribute-based positional encoding, which identifies channels through semantic information; (2) dynamic channel projection, which generates variable-length embeddings by independently projecting each channel without feature compression; and (3) dual-level contrastive learning, which jointly performs channel-level and sample-level contrastive learning to capture both channel-specific and global signal characteristics. Experimental results demonstrate that CAMEL-CLIP achieves state-of-the-art performance under linear probing and outperforms existing foundation models that rely on full fine-tuning.
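The abstract describes components (2) and (3) only at a high level. The sketch below is an illustrative NumPy toy, not the paper's implementation: all tensor shapes, the shared projection matrix, and the symmetric InfoNCE loss form are assumptions, and the paper's exact formulation may differ.

```python
import numpy as np

def project_channels(eeg, W):
    """Dynamic channel projection (sketch): each channel's time series is
    projected independently, so the output keeps one embedding per channel
    and its length scales with the channel count (no feature compression).
    eeg: (C, T) raw signal, W: (T, D) assumed shared projection."""
    return eeg @ W  # (C, D)

def info_nce(a, b, temperature=0.1):
    """Symmetric-pairing InfoNCE over matched rows of a and b (assumed loss form)."""
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    logits = (a @ b.T) / temperature
    logits -= logits.max(axis=1, keepdims=True)          # numerical stability
    probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
    return -np.log(np.diag(probs)).mean()                # diagonal = positive pairs

rng = np.random.default_rng(0)
B, C, T, D = 3, 4, 64, 16                     # hypothetical batch/channel/time/dim sizes
W = rng.normal(size=(T, D))
eeg_batch = rng.normal(size=(B, C, T))
txt_channel = rng.normal(size=(B, C, D))      # hypothetical channel-level text embeddings
txt_sample = rng.normal(size=(B, D))          # hypothetical sample-level text embeddings

chan_emb = np.stack([project_channels(x, W) for x in eeg_batch])      # (B, C, D)

# Dual-level contrastive learning (sketch):
# channel level - contrast each sample's channel embeddings with its channel texts;
loss_chan = np.mean([info_nce(chan_emb[i], txt_channel[i]) for i in range(B)])
# sample level - mean-pool channels, contrast across the batch.
loss_samp = info_nce(chan_emb.mean(axis=1), txt_sample)
loss = loss_chan + loss_samp
```

Because each channel keeps its own embedding row, a recording with a different channel set simply yields a different number of rows, which is what makes the scheme tolerant of heterogeneous channel configurations.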
Related benchmarks
| Task | Dataset | Metric | Score | Rank |
|---|---|---|---|---|
| Major Depressive Disorder Classification | Mumtaz 2016 | Balanced Accuracy | 94.39 | 8 |
| Retrieval | TUAB | Pathological Retrieval Score | 81.82 | 8 |
| Seizure Detection | CHB-MIT | Balanced Accuracy | 82.66 | 8 |
| Age Classification | TUAB | Accuracy | 73.48 | 2 |
| Age Classification | TUAB (test) | Accuracy | 67.51 | 2 |
| Gender Classification | TUAB (test) | Accuracy | 64.37 | 2 |
| Pathological Classification | TUAB | Accuracy | 83.56 | 2 |
| Pathological Classification | TUAB (test) | Accuracy | 56.4 | 2 |
| Text-based classification | TUAB | Pathological Score | 81.58 | 2 |