
VLABench

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Success Rate Evaluation | VLABench | Average Success Rate: 46.3 | 19 |
| Robot Manipulation | VLABench | Toy Success Rate: 70 | 5 |
| Language-conditioned visual reasoning | VLABench official (test) | Precision Score (Toy): 76 | 4 |
| Robotic Task Planning | VLABench | Toy Success Rate: 54 | 4 |
| Language-conditioned visual reasoning | VLABench | SR (Toy): 54 | 4 |
| Robotic Manipulation | VLABench 5 public tracks v1.0 | IS (In-dist): 79.8 | 3 |
| Robot Manipulation | VLABench Cross Category | Add Condiment Success Rate: 14 | 2 |
| Robot Manipulation | VLABench In Distribution | Add Condiment Success: 63 | 2 |