Safety Alignment Evaluation on StrongReject SR-base
[Chart: Safety Rate (y-axis) by date (x-axis); plotted date: May 30, 2026; legend entry: SFT]
Updated 1d ago
Evaluation Results
Method           Method details               Date      Safety Rate
SFT              Architecture=DeepSeek-...    2026.05   100
Stair-SFT        Architecture=DeepSeek-...    2026.05   100
Stair-DPO        Architecture=DeepSeek-...    2026.05   100
MESA             Architecture=DeepSeek-...    2026.05   100
Base(instruct)   Architecture=Qwen3-30B...    2026.05   100
SFT              Architecture=Qwen3-30B...    2026.05   100
GRPO             Architecture=Qwen3-30B...    2026.05   100
Stair-DPO        Architecture=Qwen3-30B...    2026.05   100
SafeX            Architecture=Qwen3-30B...    2026.05   100
MESA             Architecture=Qwen3-30B...    2026.05   100
Stair-SFT        Architecture=Qwen3-30B...    2026.05   98.72
SafeX            Architecture=DeepSeek-...    2026.05   98.08
GRPO             Architecture=DeepSeek-...    2026.05   95.53
Base(chat)       Architecture=DeepSeek-...    2026.05   94.88