
TerraMind: Large-Scale Generative Multimodality for Earth Observation

About

We present TerraMind, the first any-to-any generative, multimodal foundation model for Earth observation (EO). Unlike other multimodal models, TerraMind is pretrained on dual-scale representations combining both token-level and pixel-level data across modalities. On a token level, TerraMind encodes high-level contextual information to learn cross-modal relationships, while on a pixel level, TerraMind leverages fine-grained representations to capture critical spatial nuances. We pretrained TerraMind on nine geospatial modalities of a global, large-scale dataset. In this paper, we demonstrate that (i) TerraMind's dual-scale early fusion approach unlocks a range of zero-shot and few-shot applications for Earth observation, (ii) TerraMind introduces "Thinking-in-Modalities" (TiM) -- the capability of generating additional artificial data during finetuning and inference to improve the model output -- and (iii) TerraMind achieves beyond state-of-the-art performance in community-standard benchmarks for EO like PANGAEA. The pretraining dataset, the model weights, and our code are open-sourced under a permissive license.

Johannes Jakubik, Felix Yang, Benedikt Blumenstiel, Erik Scheurer, Rocco Sedona, Stefano Maurogiovanni, Jente Bosmans, Nikolaos Dionelis, Valerio Marsocci, Niklas Kopp, Rahul Ramachandran, Paolo Fraccaro, Thomas Brunschwiler, Gabriele Cavallaro, Juan Bernabe-Moreno, Nicolas Longépé • 2025

Related benchmarks

Task                       Dataset                                  Metric        Result  Rank
Semantic segmentation      Sen1Floods11                             mIoU (macro)  88.42   29
Semantic segmentation      MADOS                                    mIoU          67.44   26
Pixel-wise classification  Dominant Leaf Type Area of interest A+   IoU           80      26
Semantic segmentation      HLS Burn Scars                           mIoU          82.93   25
Semantic segmentation      PASTIS                                   Macro mIoU    41.53   24
Semantic segmentation      SN-7-TS (test)                           mIoU          60.61   24
Semantic segmentation      MADOS (test)                             mIoU          0.6952  19
Semantic segmentation      Pangaea Aggregate (test)                 Average Rank  3.56    19
Semantic segmentation      PASTIS (test)                            mIoU          40.51   19
Semantic segmentation      DynamicEarthNet (DEN)                    mIoU          38.46   19

Showing 10 of 60 rows
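
Most rows above report mean Intersection over Union (mIoU). As a reminder of what that number measures, here is a minimal sketch of a class-averaged mIoU in plain Python. The function name `mean_iou` and the toy labels are illustrative assumptions, not taken from the TerraMind code or from any benchmark's evaluation script. Individual benchmarks may differ in how they average and in how they treat absent classes or ignore labels.

```python
def mean_iou(pred, target, num_classes):
    """Class-averaged IoU over flat sequences of integer labels.

    Assumption: classes that appear in neither pred nor target are
    skipped rather than counted as IoU 0 or 1.
    """
    ious = []
    for c in range(num_classes):
        # Pixels where both prediction and ground truth are class c.
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        # Pixels where either prediction or ground truth is class c.
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else float("nan")


# Toy example with two classes and four pixels (hypothetical data).
pred = [0, 0, 1, 1]
target = [0, 1, 1, 1]
score = mean_iou(pred, target, num_classes=2)  # (1/2 + 2/3) / 2
print(round(100 * score, 2))  # 58.33
```

The table mixes percentage-scale values (for example 67.44) with a fraction-scale value (0.6952 for MADOS (test)), so check each row's scale before comparing results.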
