AutoRAN

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Harmfulness Evaluation | AutoRAN | Harmfulness Score: 1.32 | 22 |
| Jailbreak Robustness | AutoRAN | Harmfulness Rate: 0 | 17 |
| Jailbreak Attack Defense | AutoRAN | Reasoning Failure Rate (FFR): 0 | 17 |