
Multimodal Cancer Modeling in the Age of Foundation Model Embeddings

About

The Cancer Genome Atlas (TCGA) has enabled novel discoveries and served as a large-scale reference dataset in cancer research through its harmonized genomics, clinical, and imaging data. Numerous prior studies have developed bespoke deep learning models over TCGA for tasks such as cancer survival prediction. A modern paradigm in biomedical deep learning is the development of foundation models (FMs) that derive feature embeddings agnostic to any specific modeling task. FM development has been especially active for biomedical text. Although TCGA contains free-text data in the form of pathology reports, these have historically been underutilized. Here, we investigate training classical machine learning models over multimodal, zero-shot FM embeddings of cancer data. We demonstrate the ease and additive effect of multimodal fusion, which outperforms unimodal models. Further, we show the benefit of including pathology report text and rigorously evaluate the effect of model-based text summarization and hallucination. Overall, we propose an embedding-centric approach to multimodal cancer modeling.
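To make the embedding-centric idea concrete, here is a minimal sketch of one common way to fuse precomputed embeddings and fit a classical model on top. It is an illustration under stated assumptions, not the authors' pipeline: the abstract does not say which fusion rule or classifier was used, so concatenation of L2-normalized vectors and a hand-written logistic regression are stand-ins, and the patient records, field names (`text`, `image`, `label`), and 2-dimensional vectors are all hypothetical toy values.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so no modality dominates by magnitude."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def fuse(*modality_embeddings):
    """Early fusion (assumed here): concatenate normalized per-modality embeddings."""
    fused = []
    for emb in modality_embeddings:
        fused.extend(l2_normalize(emb))
    return fused


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def predict_proba(w, b, x):
    """Probability of the positive class for one fused feature vector."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


def fit_logistic_regression(X, y, lr=0.5, epochs=500):
    """Plain batch gradient descent on the mean log loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = predict_proba(w, b, xi) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


# Hypothetical toy data: in practice each vector would come from a frozen
# (zero-shot) foundation model, one per modality, with far more dimensions.
patients = [
    {"text": [1.0, 0.0], "image": [0.9, 0.1], "label": 1},
    {"text": [0.8, 0.2], "image": [1.0, 0.0], "label": 1},
    {"text": [0.0, 1.0], "image": [0.1, 0.9], "label": 0},
    {"text": [0.2, 0.8], "image": [0.0, 1.0], "label": 0},
]

X = [fuse(p["text"], p["image"]) for p in patients]
y = [p["label"] for p in patients]

w, b = fit_logistic_regression(X, y)
predictions = [1 if predict_proba(w, b, x) > 0.5 else 0 for x in X]
```

Because the foundation models stay frozen, adding a modality in this sketch only means passing one more vector to `fuse`; the downstream classifier is unchanged apart from its input width.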

Steven Song, Morgan Borjigin-Wang, Irene Madejski, Robert L. Grossman • 2025

Related benchmarks

Task | Dataset | Result | Rank
Survival Prediction | TCGA (cross-validated) | C-index 0.795 | 29
3-year mortality classification | Five TCGA cohorts (UCEC, LUAD, LGG, BRCA, BLCA) (average across cohorts) | AUROC 67.2 | 19
3-year recurrence classification | Five TCGA cohorts (UCEC, LUAD, LGG, BRCA, BLCA) (average across cohorts) | AUROC 62.6 | 19
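For readers unfamiliar with the two metrics reported above, the following sketch shows how each is commonly defined. It assumes Harrell's pairwise formulation of the C-index and the pairwise-ranking form of AUROC; the listing does not state which exact variants the benchmark used, and the input values are made up for illustration. Both functions return values in the range 0 to 1, whereas the AUROC figures above appear to be on a 0 to 100 scale.

```python
def concordance_index(times, events, risks):
    """Harrell's C-index (assumed variant).

    A pair (i, j) is comparable when patient i had an observed event and a
    strictly shorter follow-up time than patient j. The pair is concordant
    when the model gave patient i the higher risk; tied risks count as 0.5.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


def auroc(labels, scores):
    """AUROC as the fraction of positive/negative pairs ranked correctly.

    Tied scores count as 0.5.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Made-up example: follow-up times, event indicators (1 = event observed,
# 0 = censored), and model risk scores for four hypothetical patients.
c_index = concordance_index(
    times=[2, 4, 6, 8],
    events=[1, 1, 0, 1],
    risks=[0.9, 0.4, 0.5, 0.1],
)

# Made-up example: binary outcome labels and predicted scores.
auc = auroc(labels=[1, 1, 0, 0], scores=[0.9, 0.4, 0.5, 0.1])
```

In the first example there are five comparable pairs, four of them concordant, so `c_index` is 0.8; in the second, three of the four positive/negative pairs are ranked correctly, so `auc` is 0.75. The pairwise loops are quadratic in the number of patients, which is fine for illustration but slow for large cohorts.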
