Jailbreak Defense on ReNeLLM & DeepInception Average

[Chart: Harmful Score; second metric: IA; date label: May 17, 2025]
Updated 4d ago

Evaluation Results

Date      Harmful Score   IA
2025.05   1               0
2025.05   1               0
2025.05   1               0
2025.05   1               0
2025.05   1               0
2025.05   1               0
2025.05   1.01            1
2025.05   1.03            9
2025.05   1.04            2
2025.05   1.04            3
2025.05   1.13            4
2025.05   1.35            42
2025.05   1.41            2
2025.05   1.42            16
2025.05   1.43            11
2025.05   1.48            13
2025.05   1.54            36
2025.05   1.86            24
2025.05   1.97            26
2025.05   4.29            85
2025.05   4.65            97