Safety Alignment on Safety Alignment Dataset 4-order (test)
[Figure: chart of DSR and OR on this benchmark. Highlighted point: MOSAIC-2, DSR 100, Mar 17, 2026.]
Updated 1mo ago
Evaluation Results
Method      Configuration             Date     DSR    OR
MOSAIC-2    Model=Llama-3.1-8B, #...  2026.03  100    5.9
MOSAIC-5    Model=Llama-3.1-8B, #...  2026.03  100    1.8
MOSAIC-5    Model=Llama-3.2-3B, #...  2026.03  99.9   2.9
MOSAIC-2    Model=Llama-3.2-3B, #...  2026.03  99.7   5
SFT         Model=Llama-3.1-8B, #...  2026.03  98.9   6.1
SFT         Model=Llama-3.2-3B, #...  2026.03  98.9   5.2
ORPO        Model=Llama-3.1-8B, #...  2026.03  76.4   28.7
ORPO        Model=Llama-3.2-3B, #...  2026.03  72.9   29.1
In-context  Model=Llama-3.1-8B, #...  2026.03  44.5   13.9
In-context  Model=Llama-3.2-3B, #...  2026.03  41.9   12.3