Misuse Detection on Misuse Categories Scam (Elections)
[Chart: best result is GAVEL with AUC 0.99 (Jan 27, 2026). Available metrics: AUC, Balanced Accuracy, FPR.]
Evaluation Results
| Method | Details | Date | AUC | Balanced Accuracy | FPR |
|---|---|---|---|---|---|
| GAVEL | Category=Classifier, B... | 2026.01 | 0.99 | 98 | 2 |
| Activation Classifier | Category=Classifier, B... | 2026.01 | 0.9 | 75 | 12 |
| Perspective (Google) | Category=Moderation, B... | 2026.01 | 0.89 | 50 | 1 |
| CAST | Category=Inference-Tim... | 2026.01 | 0.82 | 66 | 67 |
| Llama Guard 4 (Meta) | Category=Moderation, B... | 2026.01 | 0.79 | 88 | 7 |
| RepBending | Category=Fine-Tuning,... | 2026.01 | 0.5 | 50 | 0 |
| Moderator (OpenAI) | Category=Moderation, B... | 2026.01 | 0.5 | 50 | 0 |
| Circuit Breakers | Category=Fine-Tuning,... | 2026.01 | 0.42 | 42 | 15 |
| JBShield | Category=Inference-Tim... | 2026.01 | 0.39 | 53 | 0 |
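The page does not say how Balanced Accuracy and FPR are computed. The sketch below uses the standard definitions (FPR = FP / (FP + TN), balanced accuracy = mean of the true-positive and true-negative rates) and assumes the two columns are reported as percentages. The function names and the confusion-matrix counts are hypothetical, chosen only for illustration; they are not taken from the benchmark.

```python
def rates(tp: int, fp: int, tn: int, fn: int) -> tuple[float, float]:
    """Return (TPR, FPR) from confusion-matrix counts."""
    tpr = tp / (tp + fn)  # share of misuse examples that were flagged
    fpr = fp / (fp + tn)  # share of benign examples that were flagged
    return tpr, fpr


def balanced_accuracy(tpr: float, fpr: float) -> float:
    """Mean of the true-positive rate and the true-negative rate (1 - FPR)."""
    return 0.5 * (tpr + (1.0 - fpr))


# Hypothetical counts, NOT benchmark data: 100 misuse and 100 benign examples.
tpr, fpr = rates(tp=90, fp=5, tn=95, fn=10)
ba = balanced_accuracy(tpr, fpr)
print(f"TPR={tpr:.2f}  FPR={fpr:.2f}  balanced accuracy={ba:.3f}")

# A detector that never flags anything has TPR = 0 and FPR = 0,
# so its balanced accuracy is exactly 0.5 under these definitions.
never_flags = balanced_accuracy(tpr=0.0, fpr=0.0)
```

Under these definitions a low FPR alone says little: a method that flags nothing scores an FPR of 0 and a balanced accuracy of 0.5. That pattern would be consistent with rows showing 50 and 0, although the page does not confirm this is what happened for those methods.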