
BlocksWorld

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Planning | BlocksWorld | Success Rate: 100 | 20 |
| Agent Task | BlocksWorld | Success Rate: 100 | 17 |
| Agent Behavior Adaptation | BlocksWorld (BW) (test) | Loop Ratio: 51 | 17 |
| Generalized Planning | Blocksworld | Scale: 55 | 12 |
| Planning | Blocks (Blocksworld) | Accuracy: 100 | 12 |
| Task and Motion Planning | Blocksworld n=6 | Success Rate: 80 | 4 |
| Task and Motion Planning | Blocksworld n=5 | Success Rate (SR): 100 | 4 |
| Task and Motion Planning | Blocksworld n=4 | Success Rate (%): 90 | 4 |
| Task and Motion Planning | Blocksworld (n=3) | Success Rate: 100 | 4 |
| Planning Efficiency | Blocksworld Planning | Ntokens: 589.3 | 4 |
| Planning | blocksworld-8b ML | Accuracy: 100 | 3 |
| Next-token prediction | blocksworld-8b (test) | Accuracy: 99.8 | 3 |
| Next-token prediction | blocksworld 8b (train) | Accuracy: 100 | 3 |