
Mousetrap

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Harmfulness Evaluation | Mousetrap | Harmfulness Score: 3.78 | 22 |
| Jailbreak Robustness | Mousetrap | Harmfulness Rate: 0 | 17 |
| Jailbreak Attack Defense | Mousetrap | FFR (Reasoning): 22.7 | 17 |