Earth Observation agent reasoning on Earth-Agent Benchmark subset-65
[Chart: Tool-A-O score by date. Highlighted point: AFlow, 58.07, Jan 30, 2026. Selectable metrics: Tool-A-O, Tool-I-O, Tool-E-M, Efficiency, Accuracy.]
Updated 4d ago
Evaluation Results
Method              Date     Tool-A-O  Tool-I-O  Tool-E-M  Efficiency  Accuracy
AFlow               2026.01  58.07     25.91     22.38     0.89        30.16
GeoEvolver          2026.01  57.66     44.66     39.06     1.47        76.56
Training-free GRPO  2026.01  57.24     44.36     36.44     1.36        31.25
Deepagents          2026.01  41.67     33.98     25.45     1.06        29.69
Expel               2026.01  32.72     25.94     22.48     1.79        22.58
Earth-Agent-MAS     2026.01  32.28     26.96     20.91     1.47        15.87