Policy Optimization on Office World Map 1, Exp 5
Top entry: QR-MAXRM, 3,125 average training steps (Dec 16, 2025)
Evaluation Results
Method      Date      Average Training Steps
QR-MAXRM    2025.12   3,125
QR-MAX      2025.12   24,222
QRM         2025.12   225,140
UCBVI-sB    2025.12   250,800
R-MAX       2025.12   272,080
UCBVI-B     2025.12   555,977
UCBVI-H     2025.12   1,320,070