Pull on Physical Simulator Out-of-Domain evaluation
Top result: 85.8 Success Rate (RL w. World Model)
[Chart: Success Rate by date (Dec 3, 2025)]
Evaluation Results
| Method | Policy | Date | Success Rate |
| --- | --- | --- | --- |
| RL w. World Model | OpenVLA | 2025.12 | 85.8 |
| RL w. World Model | MLP | 2025.12 | 51.3 |
| RL w. ManiSkill | OpenVLA | 2025.12 | 39.6 |
| RL w. ManiSkill | MLP | 2025.12 | 15.5 |
| Supervised Fine-tune | OpenVLA | 2025.12 | 12.5 |
| Supervised Fine-tune | MLP | 2025.12 | 10.2 |