
Plancraft

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Multi-agent system task solving | plancraft | Accuracy: 88.7 | 21 |
| World Modeling | Plancraft | Smelt: 98.4 | 20 |