Prosocial Alignment on PKUSafeRLHF (test)
[Chart not reproduced: MIP by date. Top entry: PROSOCIALALIGN, 81.5 MIP, Dec 6, 2025.]
Updated 4d ago
Evaluation Results
| Method | Backbone | Date | MIP |
| --- | --- | --- | --- |
| PROSOCIALALIGN | llama | 2025.12 | 81.5 |
| PROATTR-GEN-PCA | llama | 2025.12 | 73.5 |
| PP | llama | 2025.12 | 71.7 |
| PROSOCIALALIGN | mistral | 2025.12 | 71.5 |
| PV-ARM-SUM | llama | 2025.12 | 69.2 |
| PROATTR-GEN-PCA | mistral | 2025.12 | 68.7 |
| DIREG | llama | 2025.12 | 68.5 |
| CTRL-GEN | llama | 2025.12 | 65.8 |
| DIREG | mistral | 2025.12 | 65.2 |
| CTRL-GEN | mistral | 2025.12 | 64.3 |
| PP | mistral | 2025.12 | 62.7 |
| SAFE-ARITH | llama | 2025.12 | 56.6 |
| SAFE-ARITH | mistral | 2025.12 | 51.3 |
| PV-ARM-SUM | mistral | 2025.12 | 50.3 |