Safety Alignment Evaluation on StrongReject SR-PAPL
[Chart: Safety Rate by date. Highlighted: MESA, Safety Rate 100, May 30, 2026.]
Evaluation Results
| Method | Details | Date | Safety Rate |
|---|---|---|---|
| MESA | Architecture=DeepSeek-... | 2026.05 | 100 |
| Stair-DPO | Architecture=Qwen3-30B... | 2026.05 | 99.68 |
| MESA | Architecture=Qwen3-30B... | 2026.05 | 99.68 |
| Stair-DPO | Architecture=DeepSeek-... | 2026.05 | 99.36 |
| SFT | Architecture=Qwen3-30B... | 2026.05 | 99.36 |
| Stair-SFT | Architecture=Qwen3-30B... | 2026.05 | 99.04 |
| SafeX | Architecture=Qwen3-30B... | 2026.05 | 99.04 |
| Stair-SFT | Architecture=DeepSeek-... | 2026.05 | 98.08 |
| GRPO | Architecture=Qwen3-30B... | 2026.05 | 98.08 |
| SFT | Architecture=DeepSeek-... | 2026.05 | 95.53 |
| SafeX | Architecture=DeepSeek-... | 2026.05 | 90.74 |
| Base(instruct) | Architecture=Qwen3-30B... | 2026.05 | 83.71 |
| GRPO | Architecture=DeepSeek-... | 2026.05 | 79.55 |
| Base(chat) | Architecture=DeepSeek-... | 2026.05 | 79.23 |