Seeing Beyond the Brain: Conditional Diffusion Model with Sparse Masked Modeling for Vision Decoding

About

Decoding visual stimuli from brain recordings aims to deepen our understanding of the human visual system and build a solid foundation for bridging human and computer vision through the Brain-Computer Interface. However, reconstructing high-quality images with correct semantics from brain recordings is a challenging problem due to the complex underlying representations of brain signals and the scarcity of data annotations. In this work, we present MinD-Vis: Sparse Masked Brain Modeling with Double-Conditioned Latent Diffusion Model for Human Vision Decoding. First, we learn an effective self-supervised representation of fMRI data using masked modeling in a large latent space, inspired by the sparse coding of information in the primary visual cortex. Then, by augmenting a latent diffusion model with double-conditioning, we show that MinD-Vis can reconstruct highly plausible images with semantically matching details from brain recordings using very few paired annotations. We benchmarked our model qualitatively and quantitatively; the experimental results indicate that our method outperformed the state of the art in both semantic mapping (100-way semantic classification) and generation quality (FID), by 66% and 41% respectively. An exhaustive ablation study was also conducted to analyze our framework.
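To make the masked-modeling step more concrete, here is a minimal sketch of random patch masking on a flattened fMRI vector. It is an illustration under stated assumptions, not the authors' implementation: the function name `mask_patches`, the patch size, and the mask ratio are all hypothetical, and zeroing a patch merely stands in for however a real masked autoencoder hides or drops it. A self-supervised model would then be trained to reconstruct the hidden patches from the visible ones.

```python
import random


def mask_patches(signal, patch_size, mask_ratio, seed=None):
    """Split a 1D signal into equal patches and zero out a random subset.

    Hypothetical helper for illustration only. Returns the masked copy of
    the signal and the sorted indices of the patches that were hidden.
    """
    if patch_size <= 0 or len(signal) % patch_size != 0:
        raise ValueError("signal length must be a positive multiple of patch_size")
    if not 0.0 <= mask_ratio <= 1.0:
        raise ValueError("mask_ratio must lie in [0, 1]")

    rng = random.Random(seed)
    num_patches = len(signal) // patch_size
    num_masked = int(round(mask_ratio * num_patches))

    # Choose which patches to hide, without replacement.
    masked_ids = set(rng.sample(range(num_patches), num_masked))

    masked = list(signal)
    for patch_id in masked_ids:
        start = patch_id * patch_size
        # Zeroing is a placeholder for a learned mask token (assumption).
        masked[start:start + patch_size] = [0.0] * patch_size
    return masked, sorted(masked_ids)


# Toy "voxel" vector: 16 nonzero values split into 4 patches of 4.
voxels = [float(i + 1) for i in range(16)]
masked, hidden = mask_patches(voxels, patch_size=4, mask_ratio=0.75, seed=0)
print(hidden)
print(masked)
```

With these toy settings, three of the four patches are hidden, so the model would only see a quarter of the signal. The ratio used here is arbitrary and chosen only to show a heavily masked input.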

Zijiao Chen, Jiaxin Qing, Tiange Xiang, Wan Lin Yue, Juan Helen Zhou • 2022

Related benchmarks

Task	Dataset	Metric	Value	
fMRI-to-image reconstruction	NSD 2 (test)	Inception Feature Similarity	78.8	15
fMRI Decoding	NSD (Natural Scenes Dataset) shared (test)	Pixel Correlation	0.067	11
fMRI-to-image reconstruction	BOLD5000 (test)	Pixel Correlation (PixCorr)	21.22	9
fMRI-to-image reconstruction	NSD (test)	PixCorr	0.2736	9
fMRI-to-image reconstruction	GOD (test)	PixCorr	19.21	9
Brain-to-image retrieval	NSD (test)	Accuracy	85.9	5
Image-to-Brain Retrieval	NSD (test)	Accuracy	91.6	5
Image Question Answering	NSD (test)	Accuracy	50.37	5
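
For readers unfamiliar with the pixel correlation (PixCorr) metric in the table, the sketch below computes it as the Pearson correlation between the flattened pixel values of a reconstructed image and its ground truth. This is the usual reading of the metric, not something this page specifies: the function name `pixel_correlation` is hypothetical, and each benchmark may apply its own resizing, averaging, or scaling, which could explain why some rows look fractional and others percent-scaled.

```python
import math


def pixel_correlation(img_a, img_b):
    """Pearson correlation between two flattened images of equal length.

    Hypothetical helper for illustration; inputs are flat sequences of
    pixel intensities. Returns a value in [-1, 1].
    """
    if len(img_a) != len(img_b) or len(img_a) == 0:
        raise ValueError("images must be non-empty and the same size")

    n = len(img_a)
    mean_a = sum(img_a) / n
    mean_b = sum(img_b) / n

    # Covariance and variances, left unnormalized since the 1/n cancels.
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(img_a, img_b))
    var_a = sum((a - mean_a) ** 2 for a in img_a)
    var_b = sum((b - mean_b) ** 2 for b in img_b)

    if var_a == 0 or var_b == 0:
        raise ValueError("correlation is undefined for a constant image")
    return cov / math.sqrt(var_a * var_b)


# Toy 2x2 "images", flattened row by row.
ground_truth = [0.0, 0.2, 0.6, 1.0]
reconstruction = [0.1, 0.3, 0.5, 0.9]
print(pixel_correlation(ground_truth, reconstruction))
```

A dataset-level score would typically average this value over all test images, though the exact aggregation is an assumption here.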

