Bridging Brain and Semantics: A Hierarchical Framework for Semantically Enhanced fMRI-to-Video Reconstruction

About

Reconstructing dynamic visual experiences as videos from functional magnetic resonance imaging (fMRI) is pivotal for advancing the understanding of neural processes. However, current fMRI-to-video reconstruction methods are hindered by a semantic gap between noisy fMRI signals and the rich content of videos. This gap stems from a reliance on incomplete semantic embeddings that neither capture video-specific cues (e.g., actions) nor integrate prior knowledge. To close it, we draw inspiration from the dual-pathway processing mechanism of the human brain and introduce CineNeuron, a novel hierarchical framework for semantically enhanced video reconstruction from fMRI signals with two synergistic stages. First, a bottom-up semantic enrichment stage maps fMRI signals to a rich embedding space that comprehensively captures textual semantics, image content, action concepts, and object categories. Second, a top-down memory integration stage uses the proposed Mixture-of-Memories method to dynamically select relevant "memories" from previously seen data and fuse them with the fMRI embedding to refine the video reconstruction. Extensive experimental results on two fMRI-to-video benchmarks demonstrate that CineNeuron surpasses state-of-the-art methods across various metrics.

Yujie Wei, Chenglong Ma, Jianxiong Gao, Chenhui Wang, Shiwei Zhang, Biao Gong, Shuai Tan, Hangjie Yuan, Hongming Shan • 2026
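
The abstract describes the top-down stage only at a high level: select relevant "memories" from previously seen data and fuse them with the fMRI embedding. The following is a minimal illustrative sketch of that general idea, not the authors' Mixture-of-Memories implementation. It assumes a plain top-k cosine-similarity lookup with softmax weighting and a linear blend; the names `select_and_fuse`, `memory_bank`, `top_k`, and `alpha` are hypothetical and do not come from the paper.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def select_and_fuse(query, memory_bank, top_k=2, alpha=0.5):
    """Pick the top_k memories most similar to the query and blend them in.

    ASSUMPTION: top-k cosine retrieval plus a linear blend is a stand-in
    chosen for illustration; the paper's actual selection and fusion
    mechanism is not specified in the abstract.
    """
    sims = [cosine(query, mem) for mem in memory_bank]
    # Indices of the top_k most similar memories, best first.
    top = sorted(range(len(memory_bank)), key=lambda i: sims[i], reverse=True)[:top_k]
    weights = softmax([sims[i] for i in top])
    # Weighted average of the selected memories.
    retrieved = [
        sum(w * memory_bank[i][d] for w, i in zip(weights, top))
        for d in range(len(query))
    ]
    # alpha = 0 keeps the query unchanged; alpha = 1 uses only the memories.
    fused = [(1 - alpha) * q + alpha * r for q, r in zip(query, retrieved)]
    return top, weights, fused


if __name__ == "__main__":
    fmri_embedding = [1.0, 0.0]  # toy stand-in for an fMRI embedding
    bank = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]]  # toy "previously seen" embeddings
    print(select_and_fuse(fmri_embedding, bank))
```

In this toy setup, `alpha` controls how strongly retrieved memories override the fMRI-derived embedding. A learned gating or routing module would be a natural replacement for the fixed blend, but that is a design guess rather than something the abstract states.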

Related benchmarks

Task	Dataset	Result
fMRI-to-Video Reconstruction	cc 2017	2-way Accuracy 85	5
fMRI-to-Video Reconstruction	CineBrain	2-way Accuracy 93.7	4
Video Reconstruction from fMRI	cc and CineBrain 2017	Semantic Alignment 63.77	4
Brain-to-video reconstruction and retrieval	cc OOD 2017	Acc2 82.1	3
fMRI-to-Video Reconstruction	cc 2017 (test)	EPE 1.628	3
fMRI-to-image Retrieval	cc 2017 (test)	Top-1 Retrieval Accuracy 28.3	2
fMRI-to-Video Reconstruction	CineBrain (test)	EPE 2.126	2
image-to-fMRI Retrieval	cc 2017 (test)	Top-1 Retrieval Accuracy 26.2	2
Video Reconstruction from fMRI	BOLDMoments	Accuracy (k=2) 79.1	2

