Safety Alignment on Safety Alignment Dataset 2-order (test)
[Chart: DSR and OR by date. Highlighted point: MOSAIC-5, DSR 99.8, Mar 17, 2026.]
Updated 1mo ago
Evaluation Results
Method      Setting                     Links     DSR    OR
MOSAIC-5    Model=Llama-3.1-8B, #...    2026.03   99.8   4.3
MOSAIC-2    Model=Llama-3.1-8B, #...    2026.03   99.6   7.5
SFT         Model=Llama-3.1-8B, #...    2026.03   99.5   6.6
MOSAIC-5    Model=Llama-3.2-3B, #...    2026.03   99.3   3.3
MOSAIC-2    Model=Llama-3.2-3B, #...    2026.03   99.2   6.3
SFT         Model=Llama-3.2-3B, #...    2026.03   98.8   7.4
ORPO        Model=Llama-3.2-3B, #...    2026.03   78.1   31.4
ORPO        Model=Llama-3.1-8B, #...    2026.03   75.3   28.9
In-context  Model=Llama-3.1-8B, #...    2026.03   62     13.6
In-context  Model=Llama-3.2-3B, #...    2026.03   51.3   10.9