Malicious Prompt Detection on Babelscape ALERT
Top result: Enhanced Filtering and Summarization System, 99.73 accuracy
[Chart: Accuracy by date; x-axis tick: May 2, 2025]
Updated 1mo ago
Evaluation Results
Method                                        Details                   Date     Accuracy
Enhanced Filtering and Summarization System   Number of Prompts=14500   2025.05  99.73
Logistic Regression                           Number of Prompts=14500   2025.05  72.61
Toxic-BERT                                    Number of Prompts=14500   2025.05  28.81
Hate Speech Detector                          Number of Prompts=14500   2025.05  12.17