Policy optimization on Office World Map 3, Exp 5
[Chart: Average Training Steps by date. Highlighted entry: QR-MAXRM, 5,806 (Dec 16, 2025)]
Evaluation Results
Method      Date      Average Training Steps
QR-MAXRM    2025.12   5,806
QR-MAX      2025.12   159,597
QRM         2025.12   891,160
UCBVI-sB    2025.12   1,410,000
R-MAX       2025.12   1,581,301
UCBVI-B     2025.12   1,620,000
UCBVI-H     2025.12   2,030,000