
Stack

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Abstractive Summarization | Stack ConvoSumm 1.0 (test) | ROUGE-1: 39.73 | 11 |
| Object Stacking | Stack Composition C (test) | Success Rate: 93.7 | 10 |
| Object Stacking | Stack Spuriousness S (test) | Success Rate: 97.6 | 10 |
| Object Stacking | Stack In-distribution I (test) | Success Rate: 97.2 | 10 |
| Robotic Manipulation | Stack Shifted Environment (test) | Testing Reward: 0.77 | 8 |
| Task Planning | Stack 1.0 (test) | Average Planning Time Cost (s): 5.94 | 3 |