Calvin

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Long-horizon robot manipulation | Calvin ABCD→D | Task 1 Completion Rate: 99.4 | 96 |
| Long-horizon task completion | Calvin ABC->D | Success Rate (1): 96.8 | 67 |
| Robot Manipulation | CALVIN (ABC->D) | Average Successful Length: 4.75 | 36 |
| Instruction-following robotic manipulation | CALVIN ABC→D (unseen environment D) | Success Rate (Length 1): 98.5 | 29 |
| Robotic Manipulation | CALVIN ABCD->D | Success Rate (1 Inst): 99.7 | 26 |
| Robot Manipulation | CALVIN ABC->D 1.0 | Success Rate (1 Inst): 96.8 | 18 |
| Robotic Manipulation | Calvin ABC-D | Task-1 Score: 100 | 16 |
| Long-horizon robotic manipulation | CALVIN ABC→D Zero-shot | Task 1 Success Rate: 98.8 | 16 |
| Long-horizon robot manipulation | CALVIN | Task Completion Rate (1): 96.3 | 15 |
| Long-horizon task completion | CALVIN | Success Rate (1 Task): 93.8 | 15 |
| Long-Horizon Multi-Task Language Control | CALVIN ABC→D (test) | Seq Success (1): 96 | 13 |
| Robotic Manipulation | CALVIN D→D | Success Rate (Length 1): 93.7 | 12 |
| Long-horizon task success | CALVIN D→D long-horizon | Success Rate (LH-1): 99.5 | 11 |
| Robot manipulation | CALVIN 10% ABCD → D | Success Rate (L=1): 84.1 | 11 |
| Failure Detection | DSMF-CALVIN (test) | Accuracy: 90.64 | 10 |
| Language-conditioned visuomotor control | CALVIN ABC→D (Zero-shot) | Completion Rate (Seq 1): 96 | 8 |
| Track prediction | CALVIN ABC → D (test) | Success Rate (δ < 4): 43.7 | 7 |
| Robot Manipulation | CALVIN D->D | Average Successful Length: 2.92 | 6 |
| Video Generation | Calvin (val) | PSNR: 19.95 | 5 |
| Robotic Manipulation | CALVIN 10% of Env D | No-RGB Success Rate: 65.2 | 4 |
| Multitask Imitation Learning | CALVIN Enriched instructions (D) | Success Rate (Task 1): 73.7 | 4 |
| Long-horizon robot manipulation | CALVIN unseen lang | Task Completion Rate (1 Task): 76.4 | 4 |
| Long-horizon robot manipulation | CALVIN 10% data | Task 1 Completion Rate: 77.8 | 4 |
| Multi-task Robotic Manipulation | CALVIN (test) | Success Rate (1 task): 80.3 | 4 |
| Manipulation | CALVIN ABC -> D | Success Rate: 92.2 | 3 |

Showing 25 of 26 rows