Reinforcement Learning on MiniHack Corridor-5

[Figure: Mean Return chart; PPO-RNN; Mar 10, 2026]
Updated 1mo ago

Evaluation Results

Method    Links
2026.03    1      0.29    52
2026.03    1      0.29    52
2026.03    1      0.16    52.1
2026.03    1      0.21    52
2026.03    1      0.21    52
2026.03    1      0.21    53.8
2026.03    1      0.21    53.8
2026.03    1      0.84    94.6
2026.03    1      0.84    94.6
2026.03    1      0.63    80.7
2026.03    1      0.76    92
2026.03    1      0.76    92
2026.03    1      0.65    88
2026.03    1      0.65    88
2026.03    0.9    -161.3
2026.03    0.9    -161.3
2026.03    0.9    -118.3
2026.03    0.9    -113.6
2026.03    0.9    -113.6
2026.03    0.9    -123.6
2026.03    0.9    -123.6
2026.03    0.3    0.63    80.7
2026.03    0.2    0.16    52.1
2026.03    0.1    -118.3