Earth-Agent: Unlocking the Full Landscape of Earth Observation with Agents

About

Earth observation (EO) is essential for understanding the evolving states of the Earth system. Although recent MLLMs have advanced EO research, they still lack the capability to tackle complex tasks that require multi-step reasoning and the use of domain-specific tools. Agent-based methods offer a promising direction, but current attempts remain in their infancy, confined to RGB perception, shallow reasoning, and lacking systematic evaluation protocols. To overcome these limitations, we introduce Earth-Agent, the first agentic framework that unifies RGB and spectral EO data within an MCP-based tool ecosystem, enabling cross-modal, multi-step, and quantitative spatiotemporal reasoning beyond pretrained MLLMs. Earth-Agent supports complex scientific tasks such as geophysical parameter retrieval and quantitative spatiotemporal analysis by dynamically invoking expert tools and models across modalities. To support comprehensive evaluation, we further propose Earth-Bench, a benchmark of 248 expert-curated tasks with 13,729 images, spanning spectrum, products and RGB modalities, and equipped with a dual-level evaluation protocol that assesses both reasoning trajectories and final outcomes. We conduct comprehensive experiments varying different LLM backbones, comparisons with general agent frameworks, and comparisons with MLLMs on remote sensing benchmarks, demonstrating both the effectiveness and potential of Earth-Agent. Earth-Agent establishes a new paradigm for EO analysis, moving the field toward scientifically grounded, next-generation applications of LLMs in Earth observation. More information about Earth-Agent can be found at https://github.com/opendatalab/Earth-Agent

Peilin Feng, Zhutao Lv, Junyan Ye, Xiaolei Wang, Xinjie Huo, Jinhua Yu, Wanghan Xu, Wenlong Zhang, Lei Bai, Conghui He, Weijia Li• 2025

Related benchmarks

Task	Dataset	Result
Image Classification	WHU-RS19	Accuracy96.12	70
Image Classification	AID	Accuracy93.42	66
Tool Use and Reasoning	Earth-Bench	Accuracy65.67	62
Object Detection	DOTA	mAP60.88	49
Visual Grounding	DIOR-RSVG	--	34
Object Detection	HRSC 2016	mAP65.6	23
Active Tool Exploration	Earth-Bench	Tokens per Question (AP)1.34e+5	9
Earth Observation Agent Task Completion	Earth-Bench Lite	Spectrum Score65	7
Geospatial Reasoning	TerraBench TerraScope-Bench	Accuracy37.6	5

Showing 9 of 9 rows

Other info

Follow for update

@wizwand_team Discord