Embodied AI on Embodied
[Chart: Clean Success Rate (Eager) by date. Plotted point: Llama-3.2-1B-Instruct, 100, May 20, 2026. Available metrics: Clean Success Rate (Eager), Clean Success Rate (Compiled), Trigger Success Rate (Eager), Trigger Success Rate (Compiled).]
Updated 13d ago
Evaluation Results
| Method | Settings | Date | Clean Success Rate (Eager) | Clean Success Rate (Compiled) | Trigger Success Rate (Eager) | Trigger Success Rate (Compiled) |
|---|---|---|---|---|---|---|
| Llama-3.2-1B-Instruct | Execution Backend=CUDA... | 2026.05 | 100 | 100 | 40 | 56.2 |
| Llama-3.2-3B-Instruct | Execution Backend=CUDA... | 2026.05 | 100 | 100 | 61.3 | 42.5 |
| Qwen2.5-1.5B-Instruct | Execution Backend=CUDA... | 2026.05 | 100 | 100 | 88.7 | 76.2 |
| Qwen2.5-3B-Instruct | Execution Backend=CUDA... | 2026.05 | 100 | 100 | 76.2 | 63.7 |
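The clean success rates are identical across backends, so the eager and compiled columns differ only in trigger success rate. A minimal sketch of that comparison follows, using the values listed above. The helper name `trigger_gap` is hypothetical, and treating the scores as percentages (so differences are in percentage points) is an assumption the page does not state.

```python
# Values copied from the Evaluation Results listing above.
# Tuple order: clean eager, clean compiled, trigger eager, trigger compiled.
results = {
    "Llama-3.2-1B-Instruct": (100.0, 100.0, 40.0, 56.2),
    "Llama-3.2-3B-Instruct": (100.0, 100.0, 61.3, 42.5),
    "Qwen2.5-1.5B-Instruct": (100.0, 100.0, 88.7, 76.2),
    "Qwen2.5-3B-Instruct": (100.0, 100.0, 76.2, 63.7),
}


def trigger_gap(row):
    """Hypothetical helper: compiled minus eager trigger success rate.

    Assumes the scores are percentages, so the result is in percentage points.
    """
    _, _, trig_eager, trig_compiled = row
    # Round to one decimal to match the precision of the listed scores.
    return round(trig_compiled - trig_eager, 1)


gaps = {model: trigger_gap(row) for model, row in results.items()}

for model, gap in gaps.items():
    print(f"{model}: {gap:+.1f}")
```

A positive gap means the compiled run has the higher trigger success rate. In these figures only Llama-3.2-1B-Instruct has a positive gap.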