Safety Alignment Evaluation on StrongReject SR-PAPA
[Chart: Safety Rate by date. Highlighted point: MESA, Safety Rate 100, May 30, 2026.]
Updated 1d ago
Evaluation Results
| Method | Architecture | Date | Safety Rate |
|---|---|---|---|
| MESA | DeepSeek-... | 2026.05 | 100 |
| GRPO | Qwen3-30B... | 2026.05 | 99.68 |
| Stair-SFT | Qwen3-30B... | 2026.05 | 99.68 |
| Stair-DPO | Qwen3-30B... | 2026.05 | 99.68 |
| MESA | Qwen3-30B... | 2026.05 | 99.68 |
| SafeX | Qwen3-30B... | 2026.05 | 98.72 |
| SFT | Qwen3-30B... | 2026.05 | 97.76 |
| Stair-DPO | DeepSeek-... | 2026.05 | 96.17 |
| Stair-SFT | DeepSeek-... | 2026.05 | 93.29 |
| SFT | DeepSeek-... | 2026.05 | 91.05 |
| Base(instruct) | Qwen3-30B... | 2026.05 | 91.05 |
| SafeX | DeepSeek-... | 2026.05 | 87.54 |
| Base(chat) | DeepSeek-... | 2026.05 | 69.01 |
| GRPO | DeepSeek-... | 2026.05 | 68.37 |