Calvin

Benchmarks

| Task Name | Dataset Name | Metric | SOTA Result | Trend |
| --- | --- | --- | --- | --- |
| Robotic Manipulation | CALVIN ABCD->D | Avg Length | 0.4 | 130 |
| Long-horizon robot manipulation | Calvin ABCD→D | Task 1 Completion Rate | 99.4 | 127 |
| Robotic Manipulation | Calvin ABC-D | Task-1 Score | 100 | 71 |
| Long-horizon task completion | Calvin ABC->D | Success Rate (1) | 96.8 | 67 |
| Sequential Robotic Manipulation | CALVIN | Success Rate (1 task) | 99.8 | 63 |
| Robot Manipulation | CALVIN (ABC->D) | Average Successful Length | 4.75 | 62 |
| Long-horizon robotic manipulation | CALVIN ABC-D | Average Trajectory Length | 0.27 | 40 |
| Robotic Manipulation | CALVIN D→D | Average Length | 4.52 | 40 |
| Instruction-following robotic manipulation | CALVIN ABC→D (unseen environment D) | Success Rate (Length 1) | 98.5 | 29 |
| Robot Manipulation | CALVIN ABC->D 1.0 | Success Rate (1 Inst) | 96.8 | 18 |
| Long-horizon Robot Manipulation | CALVIN long-horizon | Success Rate 1 | 96.9 | 17 |
| Long-horizon language-conditioned policy learning | CALVIN | Success Rate (Step 5/5) | 98.4 | 16 |
| Long-horizon robotic manipulation | CALVIN ABC→D Zero-shot | Task 1 Success Rate | 98.8 | 16 |
| Long-horizon robot manipulation | CALVIN | Task Completion Rate (1) | 96.3 | 15 |
| Long-horizon task completion | CALVIN | Success Rate (1 Task) | 93.8 | 15 |
| Robotic Manipulation | CALVIN | Average Length | 2.55 | 13 |
| Long-Horizon Multi-Task Language Control | CALVIN ABC→D (test) | Seq Success (1) | 96 | 13 |
| Long-horizon language-conditioned manipulation | Calvin ABC→D | Success Rate (Seq 1) | 97.3 | 12 |
| Language-Conditioned Manipulation | CALVIN MTLC | Success Rate | 95 | 12 |
| Long-horizon task success | CALVIN D→D long-horizon | Success Rate (LH-1) | 99.5 | 11 |
| Robot manipulation | CALVIN 10% ABCD → D | Success Rate (L=1) | 84.1 | 11 |
| turn off lightbulb | CALVIN | Success Rate | 100 | 10 |
| Language-conditioned manipulation | CALVIN LH-MTLC | Success Rate (1 Instruction) | 97.5 | 10 |
| Failure Detection | DSMF-CALVIN (test) | Accuracy | 90.64 | 10 |
| Language-conditioned Robotic Instruction Following | CALVIN ABC→D | Success Rate (1 Task) | 98.9 | 8 |
Showing 25 of 59 rows