Long-horizon tasks on Minecraft Iron
[Chart: Success Rate (SR) by method. Top entry: EvoAgent, 51.82, Feb 9, 2025.]
Updated 1mo ago
Evaluation Results
Method      Date     Success Rate (SR)  Efficiency Error (EE)
EvoAgent    2025.02  51.82              58.54
Optimus-1   2025.02  45.48              46.16
Jarvis-1    2025.02  42.38              47.52
LS-Imagine  2025.02  35.82              38.27
DreamerV3   2025.02  33.79              35.68
PPO         2025.02  0                  0
GPT-4V      2025.02  0                  0