Exploration on ALFWorld
[Chart: Success Rate (Last Epoch) and Cumulative Success Rate (CSR); MemRL at 94.9 Success Rate (Last Epoch), Jan 6, 2026]
Evaluation Results
| Method | Model | Date | Success Rate (Last Epoch) | Cumulative Success Rate (CSR) |
|---|---|---|---|---|
| MemRL | GPT-5-mini | 2026.01 | 94.9 | 0.981 |
| Self-RAG | GPT-5-mini | 2026.01 | 90.7 | 0.962 |
| Mem0 | GPT-5-mini | 2026.01 | 89.4 | 0.969 |
| RAG | GPT-5-mini | 2026.01 | 88.7 | 0.93 |
| MemP | GPT-5-mini | 2026.01 | 88.5 | 0.919 |
| No Memory | GPT-5-mini | 2026.01 | 77.7 | - |
| Pass@10 | GPT-5-mini | 2026.01 | - | 0.928 |
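
To make the gaps between methods easier to read, here is a minimal sketch that computes each method's gain in Success Rate (Last Epoch) over the No Memory baseline. The scores are copied from the results above. The function name `gains_over_baseline` and the dictionary layout are illustrative assumptions and not part of the benchmark. Pass@10 is left out because it has no Success Rate (Last Epoch) entry.

```python
# Success Rate (Last Epoch) values copied from the results listed above.
# Pass@10 is omitted: it reports no Success Rate (Last Epoch).
success_rate = {
    "MemRL": 94.9,
    "Self-RAG": 90.7,
    "Mem0": 89.4,
    "RAG": 88.7,
    "MemP": 88.5,
    "No Memory": 77.7,
}


def gains_over_baseline(scores, baseline="No Memory"):
    """Return each method's score minus the baseline score (hypothetical helper)."""
    base = scores[baseline]
    # Round to one decimal to match the precision of the reported scores.
    return {
        method: round(score - base, 1)
        for method, score in scores.items()
        if method != baseline
    }


gains = gains_over_baseline(success_rate)

# Print methods from largest to smallest gain over the baseline.
for method, gain in sorted(gains.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{method}: +{gain}")
```

The gains are expressed in the same units as the Success Rate (Last Epoch) column. The same helper can be pointed at a different baseline by passing another method name as `baseline`.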