MiniHack

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Reinforcement Learning | MiniHack Corridor-5 | Mean Return: 1 | 24 |
| Agentic Task Success | MiniHack | Success Rate: 20 | 12 |
| Multi-task Reinforcement Learning | Minihack Room | Success Rate: 78 | 6 |
| Hierarchical Planning | Minihack WoD | Token Cost: 4,176 | 6 |
| Hierarchical Planning | Minihack Monster | Token Cost: 0 | 6 |
| Hierarchical Planning | Minihack-Traps | Token Cost: 0 | 6 |
| Hierarchical Planning | Minihack 15x15 | Token Cost: 0 | 6 |
| Hierarchical Planning | Minihack 5x5 | Token Cost: 1,115 | 6 |
| Reinforcement Learning | MiniHack | River Score: 1 | 4 |
| Exploration | MiniHack | RPI (%): 35.29 | 1 |