Efficiency Analysis on Alignment (train)
[Chart: Training Time by date. Plotted point: RLOO, 5 (Mar 5, 2025).]
Evaluation Results

Method   Setting                     Date      Training Time
RLOO     Training Mode=Single-I...   2025.03   5
ZOPrO    Training Mode=Single-I...   2025.03   120
DPO      Training Mode=Single-I...   2025.03   149
PPO      Training Mode=Single-I...   2025.03   530