RNA-FM: Flow-Matching Generative Model for Genome-wide RNA-Seq Prediction
About
Histopathology whole-slide images (WSIs) are routinely acquired in clinical practice and contain rich tissue morphology, but they do not directly reveal the molecular architecture and functional programs that define pathological states. RNA sequencing (RNA-seq) provides genome-wide transcriptional profiles, but at substantial cost. This gap motivates predicting genome-wide transcriptomic profiles from WSIs. Existing approaches for predicting gene expression from WSIs predominantly rely on deterministic regression with a one-to-one mapping, which limits their ability to capture biological heterogeneity and predictive uncertainty. We propose RNA-FM, a flow-matching generative framework for genome-wide bulk RNA-seq prediction from WSIs. RNA-FM formulates transcriptomic prediction as a continuous-time conditional transport problem: it learns a velocity field that maps a simple prior to the target gene expression distribution, conditioned on tissue morphology. By integrating pathway-level structure, RNA-FM enables scalable and biologically interpretable genome-wide gene expression imputation. Extensive experiments demonstrate that RNA-FM consistently outperforms state-of-the-art approaches while maintaining biological meaningfulness. Code is available at https://github.com/YXSong000/RNA-FM.
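To make the "velocity field that maps a simple prior to the target distribution" idea concrete, here is a minimal NumPy sketch of conditional flow matching. It is not the RNA-FM implementation: it assumes a standard Gaussian prior, a straight-line interpolation path between prior sample and target, a linear velocity model fitted by least squares, and Euler integration at sampling time. The function names (`cfm_training_pairs`, `fit_linear_velocity`, `sample_expression`) and the toy data are hypothetical, and the pathway-level structure mentioned in the abstract is not modeled.

```python
import numpy as np


def cfm_training_pairs(x1, cond, rng):
    """Build regression inputs/targets for flow matching on a straight-line path.

    x1:   (n, d) target expression vectors (assumed already normalized)
    cond: (n, c) conditioning features (stand-in for WSI embeddings)
    """
    n, d = x1.shape
    x0 = rng.standard_normal((n, d))      # sample from the simple prior
    t = rng.uniform(size=(n, 1))          # random time in [0, 1]
    xt = (1.0 - t) * x0 + t * x1          # point on the path from x0 to x1
    target_v = x1 - x0                    # constant velocity of that path
    feats = np.hstack([xt, t, cond, np.ones((n, 1))])
    return feats, target_v


def fit_linear_velocity(feats, target_v):
    """Fit a linear velocity field v(x_t, t, cond) by least squares (assumption:
    a real model would use a neural network trained with the same MSE target)."""
    weights, *_ = np.linalg.lstsq(feats, target_v, rcond=None)
    return weights


def sample_expression(weights, cond, dim, rng, n_steps=50):
    """Integrate dx/dt = v(x, t, cond) from t=0 to t=1 with Euler steps."""
    n = cond.shape[0]
    x = rng.standard_normal((n, dim))     # start from the prior
    dt = 1.0 / n_steps
    for k in range(n_steps):
        t = np.full((n, 1), k * dt)
        feats = np.hstack([x, t, cond, np.ones((n, 1))])
        x = x + dt * (feats @ weights)
    return x


# Toy usage: 4 "genes" that depend linearly on 3 conditioning features.
rng = np.random.default_rng(0)
cond_train = rng.standard_normal((512, 3))
mixing = rng.standard_normal((3, 4))
expr_train = cond_train @ mixing + 0.1 * rng.standard_normal((512, 4))

feats, target_v = cfm_training_pairs(expr_train, cond_train, rng)
weights = fit_linear_velocity(feats, target_v)
samples = sample_expression(weights, cond_train[:8], dim=4, rng=rng)
```

Because sampling starts from random noise, calling `sample_expression` repeatedly with the same conditioning input yields different draws, which is the property that distinguishes this formulation from a one-to-one regression.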
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Genome-wide RNA-Seq Prediction | TCGA-LUAD T200 | PCC | 0.798 | 6 |
| Genome-wide RNA-Seq Prediction | TCGA-LUAD T20 | PCC | 0.897 | 6 |
| Genome-wide RNA-Seq Prediction | TCGA-BRCA T200 | PCC | 0.798 | 6 |
| Genome-wide RNA-Seq Prediction | TCGA-BRCA T20 | PCC | 0.908 | 6 |
| Genome-wide RNA-Seq Prediction | TCGA-COAD T200 | PCC | 0.888 | 6 |
| Genome-wide RNA-Seq Prediction | TCGA-COAD T20 | PCC | 0.964 | 6 |
| Genome-wide RNA-Seq Prediction | TCGA-LUAD | PCC (T1000) | 0.729 | 6 |
| Genome-wide RNA-Seq Prediction | TCGA-BRCA | PCC (T1000) | 0.744 | 6 |
| Genome-wide RNA-Seq Prediction | TCGA-COAD | PCC (T1000) | 0.775 | 6 |
| Genome-wide RNA-Seq Prediction | CPTAC-LUAD T200 | PCC | 0.454 | 3 |
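The table reports the Pearson correlation coefficient (PCC). The exact evaluation protocol is not stated on this page, including what the T20, T200, and T1000 labels denote and whether correlation is taken per gene or per sample. As an illustration only, the sketch below assumes one common convention: compute PCC per gene across samples, then average over genes. The function name `mean_gene_pcc` is hypothetical.

```python
import numpy as np


def mean_gene_pcc(y_true, y_pred, eps=1e-12):
    """Average per-gene Pearson correlation.

    y_true, y_pred: (n_samples, n_genes) arrays of measured and predicted
    expression. Assumption: correlation is computed for each gene across
    samples and then averaged; the benchmark's own protocol may differ.
    """
    true_centered = y_true - y_true.mean(axis=0, keepdims=True)
    pred_centered = y_pred - y_pred.mean(axis=0, keepdims=True)
    numerator = (true_centered * pred_centered).sum(axis=0)
    denominator = np.sqrt(
        (true_centered ** 2).sum(axis=0) * (pred_centered ** 2).sum(axis=0)
    ) + eps  # eps guards against constant (zero-variance) genes
    per_gene = numerator / denominator
    return per_gene.mean(), per_gene
```

A perfect affine prediction such as `y_pred = 2 * y_true + 1` gives a PCC of approximately 1 for every gene, since Pearson correlation is invariant to positive scaling and shifts.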