Harmfulness Evaluation on PKU-SafeRLHF
[Chart: Beaver-7B-Cost Score over time. Highlighted result: DLMA, -1.11, Feb 19, 2024.]
Evaluation Results
Method   Model Size   Date      Beaver-7B-Cost Score
DLMA     13B          2024.02   -1.11
RCLD     13B          2024.02   -0.14
CD       13B          2024.02    0.04
DLMA     7B           2024.02    1.92
RCLD     7B           2024.02    3.32
CD       7B           2024.02    3.58
RLAIF    13B          2024.02    5.13
Llama2   13B          2024.02    6.05
RLAIF    7B           2024.02    6.12
Llama2   7B           2024.02    6.28