Multi-task (Overall) on USIM (test)
[Chart: APE over time (y-axis: APE). Highlighted entry: U0, 0.0359, Oct 9, 2025.]
Evaluation Results
Method        Setting       Date      APE
U0            Fine-tuned    2025.10   0.0359
GR00T N1.5    Fine-tuned    2025.10   0.0374
π0.5          Fine-tuned    2025.10   0.0861
OpenVLA       Fine-tuned    2025.10   0.141
π0.5          Zero-shot     2025.10   0.1496
GR00T N1.5    Zero-shot     2025.10   0.1834
OpenVLA       Zero-shot     2025.10   0.292
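
Three of the methods above are listed under both settings. As a minimal sketch, the gap between the zero-shot and fine-tuned APE for each of them could be computed as follows. The rows are copied from the table; the helper name `finetune_deltas` is hypothetical, and nothing here assumes what APE measures or which direction is better.

```python
# Rows copied from the leaderboard table: (method, setting, APE).
ROWS = [
    ("U0", "Fine-tuned", 0.0359),
    ("GR00T N1.5", "Fine-tuned", 0.0374),
    ("π0.5", "Fine-tuned", 0.0861),
    ("OpenVLA", "Fine-tuned", 0.141),
    ("π0.5", "Zero-shot", 0.1496),
    ("GR00T N1.5", "Zero-shot", 0.1834),
    ("OpenVLA", "Zero-shot", 0.292),
]


def finetune_deltas(rows):
    """Return {method: zero-shot APE minus fine-tuned APE}.

    Hypothetical helper: only methods that appear under both
    settings are included, so U0 (fine-tuned only) is skipped.
    """
    by_method = {}
    for method, setting, ape in rows:
        by_method.setdefault(method, {})[setting] = ape
    return {
        method: round(scores["Zero-shot"] - scores["Fine-tuned"], 4)
        for method, scores in by_method.items()
        if "Zero-shot" in scores and "Fine-tuned" in scores
    }


if __name__ == "__main__":
    for method, delta in finetune_deltas(ROWS).items():
        print(f"{method}: {delta:+.4f}")
```

U0 has no zero-shot row in the table, so it is left out of the result rather than given a made-up value.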