Safety Alignment on Safety Alignment Dataset 1-order (test)
[Chart: DSR and OR by date. Labeled point: MOSAIC-2, DSR 100, Mar 17, 2026.]
Evaluation Results
| Method | Details | Date | DSR | OR |
|---|---|---|---|---|
| MOSAIC-2 | Model=Llama-3.2-3B, #... | 2026.03 | 100 | 10.4 |
| MOSAIC-2 | Model=Llama-3.1-8B, #... | 2026.03 | 99.8 | 7.1 |
| MOSAIC-5 | Model=Llama-3.1-8B, #... | 2026.03 | 99.8 | 4.3 |
| MOSAIC-5 | Model=Llama-3.2-3B, #... | 2026.03 | 99.6 | 8.4 |
| SFT | Model=Llama-3.1-8B, #... | 2026.03 | 99.4 | 7.3 |
| SFT | Model=Llama-3.2-3B, #... | 2026.03 | 99.1 | 10.3 |
| ORPO | Model=Llama-3.1-8B, #... | 2026.03 | 79.8 | 29.1 |
| In-context | Model=Llama-3.1-8B, #... | 2026.03 | 76.7 | 10.2 |
| ORPO | Model=Llama-3.2-3B, #... | 2026.03 | 74.3 | 29.4 |
| In-context | Model=Llama-3.2-3B, #... | 2026.03 | 61.7 | 12.3 |
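The results above can be compared programmatically. The following is a minimal sketch, not part of the benchmark itself. It assumes that a higher DSR is better and a lower OR is better, which the page does not state. The names `Result`, `rank`, and `best_per_model` are hypothetical helpers introduced for illustration. Only the model name is kept from each details field, because the remaining settings are truncated in the listing.

```python
from typing import NamedTuple


class Result(NamedTuple):
    method: str
    model: str
    dsr: float  # DSR column, as listed
    or_: float  # OR column, as listed


# Rows copied from the Evaluation Results listing.
RESULTS = [
    Result("MOSAIC-2", "Llama-3.2-3B", 100.0, 10.4),
    Result("MOSAIC-2", "Llama-3.1-8B", 99.8, 7.1),
    Result("MOSAIC-5", "Llama-3.1-8B", 99.8, 4.3),
    Result("MOSAIC-5", "Llama-3.2-3B", 99.6, 8.4),
    Result("SFT", "Llama-3.1-8B", 99.4, 7.3),
    Result("SFT", "Llama-3.2-3B", 99.1, 10.3),
    Result("ORPO", "Llama-3.1-8B", 79.8, 29.1),
    Result("In-context", "Llama-3.1-8B", 76.7, 10.2),
    Result("ORPO", "Llama-3.2-3B", 74.3, 29.4),
    Result("In-context", "Llama-3.2-3B", 61.7, 12.3),
]


def rank(results):
    """Sort by DSR descending, then OR ascending (assumed preference order)."""
    return sorted(results, key=lambda r: (-r.dsr, r.or_))


def best_per_model(results):
    """Return the top-ranked row for each model under the assumed ordering."""
    best = {}
    for r in rank(results):
        best.setdefault(r.model, r)  # first row seen per model is its best
    return best


if __name__ == "__main__":
    for r in rank(RESULTS):
        print(f"{r.method:<11} {r.model:<13} DSR={r.dsr:5.1f} OR={r.or_:4.1f}")
```

Under this ordering, the two rows tied at DSR 99.8 are separated by OR, so the ranking can differ from the order shown on the page, which appears to sort by DSR alone.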