Misuse Detection on Misuse Categories Cybercrime (SQL Injection)
[Chart: leaderboard scores over time. Top result: Activation Classifier, AUC 99 (Jan 27, 2026). Selectable metrics: AUC, Balanced Accuracy, FPR.]
Updated 1mo ago
Evaluation Results
| Method | Method details | Date | AUC | Balanced Accuracy | FPR |
|---|---|---|---|---|---|
| Activation Classifier | Category=Classifier, B... | 2026.01 | 99 | 97 | 3 |
| GAVEL | Category=Classifier, B... | 2026.01 | 98 | 94 | 0 |
| RepBending | Category=Fine-Tuning,... | 2026.01 | 97 | 97 | 6 |
| Moderator (OpenAI) | Category=Moderation, B... | 2026.01 | 93 | 93 | 0 |
| Circuit Breakers | Category=Fine-Tuning,... | 2026.01 | 90 | 90 | 0 |
| Llama Guard 4 (Meta) | Category=Moderation, B... | 2026.01 | 76 | 89 | 3 |
| CAST | Category=Inference-Tim... | 2026.01 | 60 | 51 | 33 |
| Perspective (Google) | Category=Moderation, B... | 2026.01 | 52 | 53 | 0 |
| JBShield | Category=Inference-Tim... | 2026.01 | 24 | 58 | 0 |
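
The three reported metrics have standard textbook definitions, sketched below for reference. This is a minimal illustration, assuming a binary misuse/benign labeling. The page does not state how the benchmark computes its scores, so the function names and the example counts here are hypothetical and are not taken from the leaderboard.

```python
def false_positive_rate(fp: int, tn: int) -> float:
    """Fraction of benign inputs that were wrongly flagged as misuse."""
    return fp / (fp + tn)


def balanced_accuracy(tp: int, fn: int, fp: int, tn: int) -> float:
    """Mean of the true positive rate and the true negative rate."""
    tpr = tp / (tp + fn)  # recall on misuse inputs
    tnr = tn / (tn + fp)  # recall on benign inputs
    return (tpr + tnr) / 2


def auc(pos_scores: list[float], neg_scores: list[float]) -> float:
    """Pairwise AUC: the probability that a random misuse input receives
    a higher detector score than a random benign input (ties count half)."""
    wins = sum(
        (p > n) + 0.5 * (p == n)
        for p in pos_scores
        for n in neg_scores
    )
    return wins / (len(pos_scores) * len(neg_scores))


if __name__ == "__main__":
    # Hypothetical confusion counts, chosen only for illustration.
    print(false_positive_rate(fp=3, tn=97))
    print(balanced_accuracy(tp=97, fn=3, fp=3, tn=97))
    # Hypothetical detector scores for misuse and benign inputs.
    print(auc([0.9, 0.8], [0.1, 0.8]))
```

Balanced accuracy and FPR depend on a fixed decision threshold, while AUC is threshold-free, which is one reason the two columns can rank methods differently.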