Safety Evaluation on BeaverTails & LMSYS-Chat (test)

Rule Score: 97.88

Shadow-Level

[Chart: score axis ticks 58.88, 69.005, 79.13, 89.255; date axis Jan 12, 2026]
Updated 3mo ago

Evaluation Results

Date      Rule Score   (unlabeled)   (unlabeled)   (unlabeled)
2026.01   97.88        96.92         -0.87         5.17
2026.01   72.88        59.04         -2.36         5.16
2026.01   72.12        60.96         -2.24         5.16
2026.01   70.38        58.46         -2.42         5.04
2026.01   69.04        56.73         -2.52         5.14
2026.01   60.77        44.23         -3.07         5.15
2026.01   60.77        45            -3.03         5.15
2026.01   60.38        43.46         -3.06         4.75