Agentic Task Success on MiniHack

20 Success Rate

ALMA

Updated 4mo ago

Evaluation Results

Method	Links	Success Rate
ALMA 2026.02		20
Trajectory Retrieval 2026.02		16.7
Reasoning Bank 2026.02		15.8
No Memory 2026.02		15
G-Memory 2026.02		14.2
ALMA 2026.02		11.7
Dynamic Cheatsheet 2026.02		11.7
Reasoning Bank 2026.02		9.8
Dynamic Cheatsheet 2026.02		9.2
Trajectory Retrieval 2026.02		7.5
G-Memory 2026.02		6.8
No Memory 2026.02		6.7