
Calvin

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Long-horizon robot manipulation | Calvin ABCD→D | Task 1 Completion Rate: 99.4 | 127 |
| Robotic Manipulation | CALVIN ABCD->D | Avg Length: 0.4 | 89 |
| Long-horizon task completion | Calvin ABC->D | Success Rate (1): 96.8 | 67 |
| Robot Manipulation | CALVIN (ABC->D) | Average Successful Length: 4.75 | 48 |
| Sequential Robotic Manipulation | CALVIN | Success Rate (1 task): 99.8 | 45 |
| Robotic Manipulation | CALVIN D→D | Average Length: 4.52 | 40 |
| Long-horizon robotic manipulation | CALVIN ABC-D | Task 1 Success Rate: 98.4 | 34 |
| Instruction-following robotic manipulation | CALVIN ABC→D (unseen environment D) | Success Rate (Length 1): 98.5 | 29 |
| Robotic Manipulation | Calvin ABC-D | Task-1 Score: 100 | 26 |
| Robot Manipulation | CALVIN ABC->D 1.0 | Success Rate (1 Inst): 96.8 | 18 |
| Long-horizon language-conditioned policy learning | CALVIN | Success Rate (Step 5/5): 98.4 | 16 |
| Long-horizon robotic manipulation | CALVIN ABC→D Zero-shot | Task 1 Success Rate: 98.8 | 16 |
| Long-horizon robot manipulation | CALVIN | Task Completion Rate (1): 96.3 | 15 |
| Long-horizon task completion | CALVIN | Success Rate (1 Task): 93.8 | 15 |
| Long-Horizon Multi-Task Language Control | CALVIN ABC→D (test) | Seq Success (1): 96 | 13 |
| Language-Conditioned Manipulation | CALVIN MTLC | Success Rate: 95 | 12 |
| Long-horizon task success | CALVIN D→D long-horizon | Success Rate (LH-1): 99.5 | 11 |
| Robot manipulation | CALVIN 10% ABCD → D | Success Rate (L=1): 84.1 | 11 |
| Language-conditioned manipulation | CALVIN LH-MTLC | Success Rate (1 Instruction): 97.5 | 10 |
| Failure Detection | DSMF-CALVIN (test) | Accuracy: 90.64 | 10 |
| Language-conditioned long-horizon robotic manipulation | CALVIN ABC→D | Success Rate (1 Task): 99.6 | 8 |
| Language-conditioned visuomotor control | CALVIN ABC→D (Zero-shot) | Completion Rate (Seq 1): 96 | 8 |
| Robot Manipulation | Calvin ABC -> D | Average Path Length: 0.45 | 7 |
| Robot Manipulation | Calvin D -> D | Average Length: 2.92 | 7 |
| Track prediction | CALVIN ABC → D (test) | Success Rate (δ < 4): 43.7 | 7 |
Showing 25 of 47 rows