EVE: A Domain-Specific LLM Framework for Earth Intelligence
About
We introduce Earth Virtual Expert (EVE), the first open-source, end-to-end initiative for developing and deploying domain-specialized LLMs for Earth Intelligence. At its core is EVE-Instruct, a domain-adapted 24B-parameter model built on Mistral Small 3.2 and optimized for reasoning and question answering. On newly constructed Earth Observation (EO) and Earth Sciences benchmarks, it outperforms comparable models while preserving general capabilities. We release curated training corpora and the first systematic domain-specific evaluation benchmarks, covering multiple-choice QA (MCQA), open-ended QA, and factuality. EVE further integrates retrieval-augmented generation (RAG) and a hallucination-detection pipeline into a production system, deployed via API and GUI and serving 350 pilot users to date. All models, datasets, and code will be released under open licenses as contributions to the field at huggingface.co/eve-esa and github.com/eve-esa.
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Multiple-Choice Question Answering (Single) | Earth Observation | Accuracy | 96.35 | 7 |
| Hallucination Detection | Earth Observation | F1 Score | 84.7 | 7 |
| Multiple-Choice Question Answering (Multiple) | Earth Observation | IoU | 86.12 | 7 |
| Open-Ended Question Answering | Earth Observation | Judge Score | 96.4 | 7 |
| Open-Ended Question Answering (with Context) | Earth Observation | Judge Score | 78.28 | 7 |
| Hallucination Detection | EO and Earth Sciences Hallucination | F1 Score | 84.7 | 5 |
| Multiple-Choice Question Answering | EO and Earth Sciences MCQA Multiple | IoU | 86.12 | 5 |
| Multiple-Choice Question Answering | EO and Earth Sciences MCQA Single | Accuracy | 96.35 | 5 |
| Open-Ended Question Answering | EO and Earth Sciences Open-Ended QA | Judge Score | 96.4 | 5 |
| Overall Performance Ranking | EO and Earth Sciences Combined Benchmarks | Rank | 1.33 | 5 |
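The multiple-answer MCQA rows above report IoU, i.e. the overlap between the predicted and gold option sets. The benchmark's exact scoring code is not shown here, so the following is a minimal sketch of the usual set-IoU (Jaccard) formulation averaged over questions; the function names and the example predictions are illustrative assumptions, not the released evaluation code.

```python
def option_set_iou(predicted: set[str], gold: set[str]) -> float:
    """IoU (Jaccard index) between predicted and gold option sets."""
    if not predicted and not gold:
        return 1.0  # both empty: treat as perfect agreement
    return len(predicted & gold) / len(predicted | gold)

def mean_iou(examples: list[tuple[set[str], set[str]]]) -> float:
    """Average per-question IoU over (predicted, gold) pairs."""
    return sum(option_set_iou(p, g) for p, g in examples) / len(examples)

# Hypothetical predictions for three multi-answer questions
examples = [
    ({"A", "C"}, {"A", "C"}),  # exact match -> 1.0
    ({"A"}, {"A", "B"}),       # partial overlap -> 0.5
    ({"B", "D"}, {"A", "C"}),  # disjoint -> 0.0
]
print(round(mean_iou(examples), 2))  # -> 0.5
```

Under this formulation, a reported IoU of 86.12 means the model's selected option sets overlap heavily with the gold sets on average, with partial credit for near-misses rather than the all-or-nothing scoring of single-answer accuracy.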