Offline Goal-Conditioned Reinforcement Learning on puzzle-4x6-1B
[Chart: Success Rate over time; labeled point: NS at 9,100, Dec 11, 2025]
Updated 4d ago
Evaluation Results
Method      Details                     Date      Success Rate
NS          Backup=n-step return b...   2025.12   9,100
DQC         Method=Decoupled Q-chu...   2025.12   8,300
SHARSA      -                           2025.12   6,400
DQC-naïve   Execution=partial acti...   2025.12   3,300
QC          Strategy=Q-chunking         2025.12   2,800
OS          Backup=1-step TD-backup     2025.12   1,900
HIQL        -                           2025.12   900
IQL         -                           2025.12   600
HFBC        -                           2025.12   400
FBC         -                           2025.12   100