
Claude

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Jailbreak Attack | Claude 3.5 | ASR 0 | 19 |
| Jailbreak Attack | Claude Sonnet API 3.5 | ASR 80.5 | 16 |
| Black-box Adversarial Attack | Claude thinking 4.0 | KMR (a) 0.02 | 9 |
| Jailbreaking | Claude 4.5 | ASR 97 | 9 |
| AI-generated text detection | Claude-generated (test) | F1 Score 92.2 | 5 |