Safety Alignment Evaluation on StrongReject SR-Pair
[Chart: Safety Rate. Highlighted point: Stair-DPO, 98.72 (May 30, 2026).]
Evaluation Results

| Method | Architecture | Date | Safety Rate |
|---|---|---|---|
| Stair-DPO | Qwen3-30B... | 2026.05 | 98.72 |
| Stair-SFT | Qwen3-30B... | 2026.05 | 96.59 |
| GRPO | Qwen3-30B... | 2026.05 | 91.37 |
| MESA | Qwen3-30B... | 2026.05 | 90.73 |
| SFT | Qwen3-30B... | 2026.05 | 87.22 |
| SafeX | Qwen3-30B... | 2026.05 | 86.58 |
| Stair-DPO | DeepSeek-... | 2026.05 | 76.36 |
| SFT | DeepSeek-... | 2026.05 | 75.08 |
| MESA | DeepSeek-... | 2026.05 | 73.48 |
| Stair-SFT | DeepSeek-... | 2026.05 | 72.03 |
| Base(instruct) | Qwen3-30B... | 2026.05 | 66.77 |
| SafeX | DeepSeek-... | 2026.05 | 56.87 |
| GRPO | DeepSeek-... | 2026.05 | 55.27 |
| Base(chat) | DeepSeek-... | 2026.05 | 52.08 |