Long-horizon tasks on Minecraft Stone
[Chart: Success Rate (SR) and Efficiency Error (EE) by date. Top entry: EvoAgent, Success Rate (SR) 94.53, Feb 9, 2025.]
Updated 1mo ago
Evaluation Results
Method      Date     Success Rate (SR)  Efficiency Error (EE)
EvoAgent    2025.02  94.53              96.48
LS-Imagine  2025.02  91.5               92.36
Optimus-1   2025.02  88.79              89.25
DreamerV3   2025.02  86.82              88.39
Jarvis-1    2025.02  81.91              84.72
GPT-4V      2025.02  14.39              30.64
PPO         2025.02  13.42              27.56