
DROID

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Multi-step manipulation | DROID Tabletop Multi-step tasks | Success Rate 98 | 18 |
| Semantic reasoning manipulation | DROID Tabletop Semantic tasks | Success Rate 26 | 18 |
| Rearrangement with distractors | DROID Tabletop Distractor tasks | Success Rate 27 | 18 |
| Predicate Detection | DROID phasic episodes (test) | F1 Score 96.8 | 12 |
| Pick-and-place | DROID Tabletop Simple tasks | Success Rate 27 | 12 |
| Video Depth | DROID | Abs Rel 0.223 | 8 |
| Robot Policy Learning | DROID Franka Panda | Average Success Rate 47.4 | 7 |
| Video-to-Video Generation | Droid (test) | VBench 0.81 | 6 |
| Interactive long-trajectory generation | DROID (val) | PSNR 23.56 | 6 |
| Video Frame Rank-Correlation | DROID | VOC Rank-Correlation (Sparse) 0.99 | 6 |
| Wipe Table with Towel | Real-world DROID platform | Success Rate 95 | 5 |
| Remove Marker from Bowl | DROID platform Real-world | Success Rate 35 | 5 |
| Put Marker into Bowl | DROID Real-world | Success Rate 55 | 5 |
| Put Cube into Cup | DROID Real-world | Success Rate 65 | 5 |
| Decision Tree Rashomon Set construction | Droid | Runtime (s) 0.78 | 5 |
| 4D Scene Generation | Droid Realistic (test) | FVD 31.82 | 5 |
| Autoregressive rollout | DROID External Camera (val) | SSIM 86 | 5 |
| Camera Tracking | DROID-W | Error Rate (Downtown 1) 0.1 | 5 |
| Temporal Value Estimation | DROID (test) | VOC+ 93.67 | 5 |
| Segmentation | DROID internal held-out | Dice Coefficient 76.7 | 5 |
| Rashomon set approximation | Droid-84 | Recall 100 | 4 |
| Monocular Depth Estimation | DROID (unseen domain) | Abs Rel 0.237 | 4 |
| Dynamic Affordance Prediction | DROID 70/30 (test) | Open Microwave MAE 37 | 4 |
| Video generation | DROID (Unseen Scene) | PSNR 19.73 | 4 |
| Video generation | DROID Unseen Camera Viewpoint | PSNR 20.87 | 4 |
Showing 25 of 34 rows