
Red-teaming Safety Evaluation on Basebench

1.73 HS

Meta-Llama-3.1-8B (AART-aligned)

[Chart: score axis from 1.6856 to 2.5847; date label May 30, 2025]
Updated 4d ago

Evaluation Results

Method    Links
2025.05   1.73   0.06   0.02
2025.05   1.74   0.05   0.02
2025.05   1.76   -      0.04
2025.05   1.85   -      0.08
2025.05   1.86   0.03   0.03
2025.05   1.87   -      0.03
2025.05   1.91   -      0.06
2025.05   1.97   -      0.1
2025.05   1.98   0.17   0.08
2025.05   2.11   -      0.1
2025.05   2.19   0.24   0.09
2025.05   2.28   -      0.12
2025.05   2.31   0.27   0.12
2025.05   2.62   0.41   0.14
2025.05   2.76   -      0.19
2025.05   2.84   0.48   0.18