Safety Evaluation on PAP
[Chart: Safety Score by date. Top entry: REFLECTOR (+GDPO), Safety Score 93.65, May 20, 2026.]
Updated 13d ago
Evaluation Results
| Method | Backbone | Date | Safety Score |
|---|---|---|---|
| REFLECTOR (+GDPO) | Llama-3.1-8B-... | 2026.05 | 93.65 |
| REFLECTOR (+SFT) | Llama-3.1-8B-... | 2026.05 | 92.69 |
| REFLECTOR (+GDPO) | Qwen-2.5-7B-I... | 2026.05 | 91.34 |
| REFLECTOR (SFT) | Qwen-2.5-7B-I... | 2026.05 | 86.96 |
| STAIR | Llama-3.1-8B-... | 2026.05 | 85.35 |
| STAIR | Qwen-2.5-7B-I... | 2026.05 | 82.53 |
| Shallow-Align | Llama-3.1-8B-... | 2026.05 | 78.2 |
| Shallow-Align | Qwen-2.5-7B-I... | 2026.05 | 76.1 |
| Self-Critique | Qwen-2.5-7B-I... | 2026.05 | 48.07 |
| Self-Critique | Llama-3.1-8B-... | 2026.05 | 45.6 |
| DPO | Qwen-2.5-7B-I... | 2026.05 | 44.03 |
| DPO | Llama-3.1-8B-... | 2026.05 | 41.7 |
| SFT | Llama-3.1-8B-... | 2026.05 | 40.1 |
| SFT | Qwen-2.5-7B-I... | 2026.05 | 38.46 |
| Original | Llama-3.1-8B-... | 2026.05 | 38.28 |
| Original | Qwen-2.5-7B-I... | 2026.05 | 36.34 |