Policy Optimization on Office World MAP4
Best result: QR-MAXRM, 5,630 average training steps
[Figure: Average Training Steps by date (Dec 16, 2025)]
Evaluation Results
Method      Settings                    Date      Average Training Steps
QR-MAXRM    Exp=EXP6, Map Size=15x15    2025.12   5,630
QR-MAX      Exp=EXP6, Map Size=15x15    2025.12   20,150
UCBVI-sB    Exp=EXP6, Map Size=15x15    2025.12   80,160
R-MAX       Exp=EXP6, Map Size=15x15    2025.12   82,000
UCBVI-B     Exp=EXP6, Map Size=15x15    2025.12   90,030
UCBVI-H     Exp=EXP6, Map Size=15x15    2025.12   101,120
QRM         Exp=EXP6, Map Size=15x15    2025.12   9,910,340