Towards Interpretable Visual Decoding with Attention to Brain Representations
About
Recent work has demonstrated that complex visual stimuli can be decoded from human brain activity using deep generative models, offering new ways to probe how the brain represents real-world scenes. However, many existing approaches first map brain signals into intermediate image or text feature spaces before guiding the generative process, which obscures the contributions of different brain areas to the final reconstruction. In this work, we propose NeuroAdapter, a visual decoding framework that directly conditions a latent diffusion model on brain representations, bypassing the need for intermediate feature spaces. Our method achieves reconstruction quality competitive with prior work on public fMRI datasets, while providing greater transparency into how brain signals drive visual reconstruction. To support this transparency, we introduce an Image-Brain BI-directional interpretability framework (IBBI) that analyzes cross-attention patterns across diffusion denoising steps to reveal how different cortical areas influence the unfolding generative trajectory. Our work highlights the potential of end-to-end brain-to-image reconstruction and establishes a path for interpretable neural decoding.
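The core mechanism described above — a latent diffusion model whose denoising steps attend directly to brain representations — can be illustrated with a minimal cross-attention sketch. This is a hedged, NumPy-only illustration, not the NeuroAdapter implementation: the projection matrices are random stand-ins for learned weights, and the token shapes (16 latent patches, 8 brain-derived tokens) are arbitrary. The attention map it returns is the kind of object the IBBI analysis would inspect per denoising step.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(latent_tokens, brain_tokens, d_k):
    """Latent image patches (queries) attend to brain-derived tokens (keys/values).

    The weight matrices below are random placeholders; in a trained model
    they would be learned parameters of the conditioning adapter.
    """
    rng = np.random.default_rng(0)
    W_q = rng.standard_normal((latent_tokens.shape[-1], d_k))
    W_k = rng.standard_normal((brain_tokens.shape[-1], d_k))
    W_v = rng.standard_normal((brain_tokens.shape[-1], d_k))
    Q = latent_tokens @ W_q                    # (n_latent, d_k)
    K = brain_tokens @ W_k                     # (n_brain, d_k)
    V = brain_tokens @ W_v                     # (n_brain, d_k)
    attn = softmax(Q @ K.T / np.sqrt(d_k))     # (n_latent, n_brain)
    return attn @ V, attn

latents = np.random.default_rng(1).standard_normal((16, 32))  # 16 latent patches
brain = np.random.default_rng(2).standard_normal((8, 64))     # 8 brain-region tokens
out, attn = cross_attention(latents, brain, d_k=32)
# Each row of `attn` sums to 1: it shows how one latent patch distributes
# attention over brain tokens, which is what an interpretability pass can read out.
```

Logging `attn` at every denoising step, rather than only at the end, is what makes it possible to trace which cortical inputs shape the generative trajectory early versus late.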
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| fMRI-to-image reconstruction | NSD 2 (test) | Inception Feature Similarity | 68.18 | 15 |
| fMRI-to-image reconstruction | NSD | PixCorr | 12.4 | 9 |
| Brain-to-Image Reconstruction | NSD-Imagery Mental Imagery Trials (test) | Pixel Correlation (PixCorr) | 0.037 | 6 |
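Two of the benchmark rows above report PixCorr, which is commonly defined as the mean per-image Pearson correlation between the flattened pixels of each reconstruction and its ground-truth stimulus. A minimal sketch under that assumed definition (the evaluation details of the benchmarks themselves may differ, e.g. in image resizing):

```python
import numpy as np

def pixcorr(recon, target):
    """Mean Pearson correlation between flattened image pairs.

    recon, target: arrays of shape (n_images, H, W, C) with matching shapes.
    """
    r = recon.reshape(recon.shape[0], -1)
    t = target.reshape(target.shape[0], -1)
    r = r - r.mean(axis=1, keepdims=True)
    t = t - t.mean(axis=1, keepdims=True)
    num = (r * t).sum(axis=1)
    den = np.sqrt((r ** 2).sum(axis=1) * (t ** 2).sum(axis=1))
    return (num / den).mean()

rng = np.random.default_rng(0)
gt = rng.random((4, 8, 8, 3))                    # toy ground-truth images
noisy = gt + 0.1 * rng.standard_normal(gt.shape)  # a "good" reconstruction
score_perfect = pixcorr(gt, gt)      # identical images give 1.0
score_noisy = pixcorr(noisy, gt)     # degraded but still high
```

Because PixCorr is bounded in [-1, 1] per image, the very low value on the mental-imagery split (0.037) reflects how much harder imagined stimuli are to reconstruct than perceived ones.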