Safety Alignment Evaluation on Safety Evaluation Dataset
[Chart: Response Safe Rate (Llama Guard Model) over time. Leading entry: DCR, 81 (Feb 10, 2026). Selectable metrics: Response Safe Rate (Llama Guard Model), Response Safe Rate (OpenAI Moderation API).]
Updated 1mo ago
Evaluation Results
Method     Date      Response Safe Rate (Llama Guard Model)   Response Safe Rate (OpenAI Moderation API)
DCR        2026.02   81                                       92
Surgical   2026.02   78                                       87
STL-aug    2026.02   77                                       88
STL        2026.02   72                                       86
SCANS      2026.02   65                                       80