
Safety Evaluation on SafetyBench en

81.2 Avg Score

ROSE

[Chart: Avg Score over time, Feb 19, 2024 to Jun 4, 2025]
Updated 4d ago

Evaluation Results

Date     Avg Score
2024.02  81.2       82.7  88    88.6  76.7  89.6  84.4  64.4
2024.02  80.8       82.9  87.8  89    76    88.6  83.8  63.8
2024.02  79         78.5  81.2  86.5  78.4  80.4  79.4  70.8
2024.02  78         77.6  82    85.9  76.7  79    78.9  68.4
2024.02  74.3       71.9  74.2  83.7  71.3  76.5  74.4  70.7
2024.02  73.8       71    76    83.6  66.1  76.3  76    70.5
2024.02  73.4       72.2  76.3  80.9  69    74.1  77.4  67
2024.02  72.7       71.7  76.3  79.9  68    72.1  76.5  66.5
2025.06  72         68.4  74.2  77.5  69.5  72.5  75.9  68.2
2025.06  71.9       68.4  74.3  77.4  69.1  72.7  75.8  68.2
2025.06  71.5       67.9  73.5  77    68.9  71.8  75.3  67.4
2025.06  70.4       67.4  72.6  78.2  70.6  69.6  76.8  61
2024.02  69.5       69.2  73.6  78.4  61.7  69.2  72.3  64.5
2024.02  68.2       64.8  72.2  76.9  64.8  62.4  72.2  64.8
2024.02  66.7       65.5  68.8  74.6  66.6  62.1  71.7  58.7
2024.02  66.1       65.1  68.4  74.3  66.6  60.8  72.4  56.3
2024.02  66.1       64    69.7  78    65.1  64.2  71.6  53.4
2024.02  65.2       62.8  69.8  76.8  57.8  63.1  71.7  57.6
2024.02  62.6       59    61.8  70.4  62.8  58    62.5  63.1
2024.02  61.9       60    61.7  70    63.7  57.4  62.3  58
2024.02  60.7       59.2  57.5  67.6  64.7  58.4  63.3  55.3
2024.02  60.2       59.5  56.6  68.5  64.2  57.8  63.2  53
2024.02  57.8       53.2  58.5  66.2  57.1  56.4  60.6  54.4
2024.02  56.8       51.3  59.2  65    53.4  54.7  61.2  54.8
2024.02  36.7       36.4  26    28    49.5  34.5  27.6  49.9