Open-Ended Question Answering on Earth Observation
[Chart: Judge Score (y-axis ticks 90.758 to 95.6585). Highlighted point: Qwen3, 97.05, Mar 20, 2026. Available metrics: Judge Score, EVE WR, Rank.]
Evaluation Results
Method              Size (B)   Date      Judge Score   EVE WR   Rank
Qwen3               235-A22    2026.03   97.05         48.24    2.17
GPT-4.1             1800*      2026.03   96.48         49.77    2.83
Mistral Medium 3.1  200*       2026.03   96.45         50.2     4.17
EVE-Instruct        24         2026.03   96.4          -        3.5
GPT OSS             120A5      2026.03   94.2          50.3     4.83
GPT-5 nano          20*        2026.03   92.2          50.2     5.33
MiniMax m2.5        230A10     2026.03   91            51.1     5.17