JEPA-DNA: Grounding Genomic Foundation Models through Joint-Embedding Predictive Architectures

About

Genomic Foundation Models (GFMs) typically rely on Masked Language Modeling (MLM) or Next-Token Prediction (NTP) to learn the "Laws of Nature". While effective at capturing local syntax, these generative paradigms prioritize token-level reconstruction over high-level functional context. We introduce JEPA-DNA, a model-agnostic continual training framework that integrates a Joint-Embedding Predictive Architecture (JEPA) with traditional generative objectives. By supervising global sequence embeddings in a latent space, JEPA-DNA forces models to predict the functional representations of masked genomic segments, shifting the learning signal from token recovery to semantic alignment. We evaluate JEPA-DNA on 17 diverse genomic benchmark tasks, demonstrating consistent gains in linear probing and zero-shot performance regardless of the underlying GFM architecture or generative objective. Our framework establishes a new state-of-the-art for GFMs, surpassing the best existing models by bridging generative precision with latent semantic grounding. Through extensive ablation studies, we further characterize the synergistic interplay between generative and latent objectives. Our code is publicly available at https://github.com/NVIDIA-Digital-Bio/JEPA-DNA.
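
As a rough illustration of what combining a generative objective with a latent predictive objective can look like, the sketch below adds a token-level cross-entropy term to an embedding-level distance term. The abstract does not state the actual loss functions, so everything here is an assumption made for illustration: the use of cosine distance for the latent term, the `latent_weight` parameter, and the function names are all hypothetical, and the released code may differ.

```python
import math

def cross_entropy(logits, target_index):
    """Softmax cross-entropy for one masked token position."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_sum_exp - logits[target_index]

def cosine_distance(predicted, target):
    """1 - cosine similarity between two embedding vectors (assumed latent loss)."""
    dot = sum(p * t for p, t in zip(predicted, target))
    norm_p = math.sqrt(sum(p * p for p in predicted))
    norm_t = math.sqrt(sum(t * t for t in target))
    return 1.0 - dot / (norm_p * norm_t)

def combined_loss(masked_logits, masked_targets,
                  predicted_embedding, target_embedding,
                  latent_weight=1.0):
    """Generative term (mean token cross-entropy) plus a weighted latent term.

    latent_weight is a hypothetical hyperparameter, not taken from the paper.
    """
    generative = sum(
        cross_entropy(logits, target)
        for logits, target in zip(masked_logits, masked_targets)
    ) / len(masked_targets)
    latent = cosine_distance(predicted_embedding, target_embedding)
    return generative + latent_weight * latent

# Toy usage: one masked position over a 4-letter vocabulary (A, C, G, T),
# and 3-dimensional embeddings. All numbers are made up.
loss = combined_loss(
    masked_logits=[[2.0, 0.1, 0.1, 0.1]],
    masked_targets=[0],
    predicted_embedding=[0.9, 0.1, 0.0],
    target_embedding=[1.0, 0.0, 0.0],
    latent_weight=0.5,
)
print(loss)
```

In this form the first term rewards recovering the masked tokens, while the second rewards matching a sequence-level representation, which is the shift from token recovery to semantic alignment that the abstract describes.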

Ariel Larey, Elay Dahan, Amit Bleiweiss, Raizy Kellerman, Guy Leib, Omri Nayshool, Dan Ofer, Tal Zinger, Dan Dominissini, Gideon Rechavi, Nicole Bussola, Simon Lee, Shane O'Connell, Dung Hoang, Marissa Wirth, Alexander W. Charney, Nati Daniel, Yoli Shavit • 2026

Related benchmarks

Task                          Dataset                 Metric  Result  Rank
Disease Variant Prediction    BEND Disease Variant    AUROC   0.512   2
Expression Effect Prediction  BEND Expression Effect  AUROC   0.524   2
Variant Effect Prediction     TraitGym Mendelian      AUROC   0.544   2
Variant Effect Prediction     Songlab ClinVar         AUROC   0.544   2
Pathogenicity Prediction      LRB Pathogenic OMIM     AUROC   0.452   2
Variant Effect Prediction     TraitGym Complex        AUROC   49.1    2
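
The results above are all reported as AUROC. For readers less familiar with the metric, here is a minimal sketch of how it can be computed from scores and binary labels, using its pairwise interpretation: the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one. The function name and the example data are hypothetical and are not taken from any of the listed benchmarks.

```python
def auroc(scores, labels):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUROC needs at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

# Made-up variant scores: label 1 = pathogenic, label 0 = benign.
example_scores = [0.9, 0.4, 0.6, 0.2]
example_labels = [1, 1, 0, 0]
print(auroc(example_scores, example_labels))
```

On a 0 to 1 scale, a value of 0.5 corresponds to random ranking and 1.0 to perfect separation of the two classes. The quadratic loop is fine for a small illustration; larger evaluations typically use a rank-based implementation.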