Safety Evaluation on Do-Not
Top result: REFLECTOR (+GDPO), Safety Score 89.46
[Chart: Safety Score, May 20, 2026]
Updated 13d ago
Evaluation Results
Method              Backbone            Date      Safety Score
REFLECTOR (+GDPO)   Qwen-2.5-7B-I...    2026.05   89.46
REFLECTOR (SFT)     Qwen-2.5-7B-I...    2026.05   85.83
REFLECTOR (+GDPO)   Llama-3.1-8B-...    2026.05   84.7
STAIR               Qwen-2.5-7B-I...    2026.05   82.01
REFLECTOR (+SFT)    Llama-3.1-8B-...    2026.05   80.2
STAIR               Llama-3.1-8B-...    2026.05   78.5
Shallow-Align       Qwen-2.5-7B-I...    2026.05   76.3
Shallow-Align       Llama-3.1-8B-...    2026.05   74.3
DPO                 Llama-3.1-8B-...    2026.05   65.89
Self-Critique       Llama-3.1-8B-...    2026.05   65.2
DPO                 Qwen-2.5-7B-I...    2026.05   63.85
Self-Critique       Qwen-2.5-7B-I...    2026.05   61.87
Original            Qwen-2.5-7B-I...    2026.05   60.91
SFT                 Llama-3.1-8B-...    2026.05   60.58
SFT                 Qwen-2.5-7B-I...    2026.05   59.64
Original            Llama-3.1-8B-...    2026.05   58.57