Tokenizing Single-Channel EEG with Time-Frequency Motif Learning
About
Foundation models are reshaping EEG analysis, yet EEG tokenization remains an open challenge. This paper presents TFM-Tokenizer, a novel tokenization framework that learns a vocabulary of time-frequency motifs from single-channel EEG signals and encodes them into discrete tokens. We propose a dual-path architecture with time-frequency masking to capture robust motif representations. The tokenizer is model-agnostic, supporting both lightweight transformers and existing foundation models for downstream tasks. Our study demonstrates three key benefits. Accuracy: Experiments on four diverse EEG benchmarks demonstrate consistent performance gains across both single- and multi-dataset pretraining settings, achieving up to $11\%$ improvement in Cohen's Kappa over strong baselines. Generalization: As a plug-and-play component, the tokenizer consistently boosts the performance of diverse foundation models, including BIOT and LaBraM. Scalability: By operating at the single-channel level rather than relying on the strict 10-20 EEG system, our method has the potential to be device-agnostic. Experiments on ear-EEG sleep staging, which differs from the pretraining data in signal format, channel configuration, recording device, and task, show that our tokenizer outperforms baselines by $14\%$. A comprehensive token analysis reveals strong class-discriminative, frequency-aware, and consistent structure, enabling improved representation quality and interpretability. Code is available at https://github.com/Jathurshan0330/TFM-Tokenizer.
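To make the idea of encoding a single-channel signal into discrete tokens concrete, here is a minimal sketch of generic vector quantization over short-time spectra. It is not the paper's dual-path architecture or its masking objective: the function names `spectrogram_patches` and `quantize`, the window and hop sizes, the 200 Hz sampling rate, and the random (unlearned) codebook are all assumptions chosen for illustration.

```python
import numpy as np


def spectrogram_patches(signal, win=200, hop=100):
    """Split a 1-D signal into overlapping windows and return log-magnitude spectra."""
    n_frames = 1 + (len(signal) - win) // hop
    frames = np.stack([signal[i * hop : i * hop + win] for i in range(n_frames)])
    frames = frames * np.hanning(win)  # taper each window to reduce spectral leakage
    spectra = np.abs(np.fft.rfft(frames, axis=1))
    return np.log1p(spectra)  # shape: (n_frames, win // 2 + 1)


def quantize(features, codebook):
    """Map each feature vector to the index of its nearest codebook entry (L2 distance)."""
    dists = ((features[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    return dists.argmin(axis=1)


rng = np.random.default_rng(0)

# Placeholder input: 10 s of noise standing in for one EEG channel at an assumed 200 Hz.
eeg = rng.standard_normal(2000)
feats = spectrogram_patches(eeg)

# Hypothetical vocabulary of 64 entries; random here, whereas a real tokenizer learns it.
codebook = rng.standard_normal((64, feats.shape[1]))

tokens = quantize(feats, codebook)  # one integer token per time window
```

Because each channel is tokenized independently, the same procedure applies regardless of how many channels a recording device provides, which is the property the abstract refers to as operating at the single-channel level.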
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Event Type Classification | TUEV | Balanced Accuracy | 59.74 | 50 |
| Seizure Detection | CHB-MIT | Balanced Accuracy | 0.675 | 34 |
| Abnormality Detection | TUAB | Balanced Accuracy | 81.52 | 27 |
| Emotion Recognition | SEED v1 (cross-subject) | Cohen's κ | 29.9 | 24 |
| Emotion Recognition | SEED VII | Balanced Accuracy | 0.22 | 21 |
| Emotion Recognition | SEED | Accuracy (SEED) | 53.3 | 20 |
| Emotion Recognition | SEED V | Accuracy | 28.8 | 16 |
| Seizure Type Classification | IIIC Seizure | Balanced Accuracy | 57.75 | 14 |
| Emotion Recognition | SEED-IV v1 (cross-subject) | Cohen's Kappa | 11.2 | 12 |
| Emotion Recognition | SEED IV | Accuracy | 34 | 12 |
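
For readers unfamiliar with the two metrics that dominate the table above, the following sketch computes balanced accuracy and Cohen's Kappa from a confusion matrix using their standard definitions. The helper names and the toy labels are assumptions for illustration, not taken from the paper's code, and the sketch assumes every class appears at least once in the true labels.

```python
import numpy as np


def confusion(y_true, y_pred, n_classes):
    """Build a confusion matrix with true classes on rows and predictions on columns."""
    cm = np.zeros((n_classes, n_classes), dtype=float)
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1
    return cm


def balanced_accuracy(cm):
    """Mean of per-class recall; assumes every class has at least one true sample."""
    recalls = np.diag(cm) / cm.sum(axis=1)
    return recalls.mean()


def cohens_kappa(cm):
    """Agreement between labels and predictions, corrected for chance agreement."""
    n = cm.sum()
    p_observed = np.trace(cm) / n
    p_expected = (cm.sum(axis=0) * cm.sum(axis=1)).sum() / n**2
    return (p_observed - p_expected) / (1 - p_expected)


# Toy, imbalanced example (hypothetical labels).
y_true = [0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 1, 1, 0]
cm = confusion(y_true, y_pred, n_classes=2)

print(balanced_accuracy(cm))  # 0.625
print(cohens_kappa(cm))       # approximately 0.25
```

Both metrics are less sensitive to class imbalance than plain accuracy, which is 4/6 in this toy case.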