Building Blocks for a Complex-Valued Transformer Architecture

About

Most deep learning pipelines are built on real-valued operations and handle real-valued inputs such as images, speech, or music signals. However, many applications naturally involve complex-valued signals or images, such as MRI or remote sensing. Additionally, the Fourier transform of a signal is complex-valued and has numerous applications. We aim to make deep learning directly applicable to these complex-valued signals without using projections into $\mathbb{R}^2$. Thus, we add to the recent developments in complex-valued neural networks by presenting building blocks that transfer the transformer architecture to the complex domain. We present multiple versions of a complex-valued Scaled Dot-Product Attention mechanism as well as a complex-valued layer normalization. We test on a classification task and a sequence generation task on the MusicNet dataset and show improved robustness to overfitting while maintaining on-par performance compared to the real-valued transformer architecture.
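To make the idea of a complex-valued Scaled Dot-Product Attention concrete, here is a minimal sketch of one possible variant. It assumes the attention scores are the real part of the complex inner product between queries and keys, so that an ordinary softmax can be applied. The abstract mentions multiple versions, and this sketch is not claimed to match any particular one of them. The function name `complex_attention` and the list-of-lists layout are illustrative assumptions.

```python
import math


def complex_attention(Q, K, V):
    """Illustrative complex-valued scaled dot-product attention.

    Q, K: lists of n complex vectors of length d.
    V:    list of n complex vectors of length d_v.
    Assumption: scores are Re(<q, k>) / sqrt(d), with <q, k> = sum_i q_i * conj(k_i).
    """
    d = len(Q[0])
    d_v = len(V[0])
    out = []
    for q in Q:
        # Real part of the complex inner product gives a real-valued score.
        scores = [
            sum(qi * ki.conjugate() for qi, ki in zip(q, k)).real / math.sqrt(d)
            for k in K
        ]
        # Numerically stable softmax over the real scores.
        m = max(scores)
        exps = [math.exp(s - m) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        # Real weights applied to complex values keep the output complex.
        out.append([sum(w * v[j] for w, v in zip(weights, V)) for j in range(d_v)])
    return out


if __name__ == "__main__":
    Q = [[1 + 1j, 0j], [0j, 1 - 1j]]
    V = [[1j, 2 + 0j], [3 + 0j, -1j]]
    for row in complex_attention(Q, Q, V):
        print(row)
```

One property of this particular choice is that rotating all queries and keys by the same global phase leaves the scores unchanged, since the phase cancels in the product of a query with a conjugated key. Other choices, such as using the magnitude of the inner product, would behave differently.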

Florian Eilers, Xiaoyi Jiang • 2023

Related benchmarks

Task	Dataset	Result
Modulation Classification	RadioML RML2016 mirror (test)	L1 Error 0.244	6
Music Modeling	Real MusicNet	L1 Error 0.201	6
Image Classification	FFT-MNIST	Accuracy 39	6
Long Range Arena ListOps	LRA-ListOps small	Accuracy (LRA-ListOps small) 63.7	6
Pitch Estimation	multi-pitch	Accuracy 82	6
Radio Modulation Classification	RadioML L2	Accuracy 27	6
Copying Task	Copy d=500	Accuracy 10	6
Copying Task	Copy d=2000	Accuracy 8	6
Logical operations parsing	ListOps mid L1024	Accuracy 10.4	6
Memory retention task	phase-memory	Accuracy 93	6

Showing 10 of 12 rows
