Harmful prompt detection on Combined Average
[Figure: F1 Score (Combined Average). Top result: MLPM, 90.18 (Feb 22, 2025).]
Evaluation Results
Method                  Configuration              Date     F1 Score (Combined Average)
MLPM                    Backbone=OLMo2-7B-Inst...  2025.02  90.18
WildGuard               Methodology=Guard Model    2025.02  88.93
MLPM                    Backbone=Llama-8B-Inst...  2025.02  88.3
MLPM                    Backbone=Mistral-7B-In...  2025.02  87.55
Abdelnabi et al.        Backbone=OLMo2-7B-Inst...  2025.02  87.35
Ayub & Majumdar         Backbone=OLMo2-7B-Inst...  2025.02  87.06
MLPM                    Backbone=Qwen3-8B-Inst...  2025.02  85.95
GraniteGuardian-3-1-8B  Methodology=Guard Model    2025.02  85.62
Abdelnabi et al.        Backbone=Llama-8B-Inst...  2025.02  84.51
Abdelnabi et al.        Backbone=Qwen3-8B-Inst...  2025.02  84.36
Abdelnabi et al.        Backbone=Mistral-7B-In...  2025.02  84.32
Ayub & Majumdar         Backbone=Mistral-7B-In...  2025.02  84.31
Ayub & Majumdar         Backbone=Qwen3-8B-Inst...  2025.02  82.69
Ayub & Majumdar         Backbone=Llama-8B-Inst...  2025.02  82.09
LlamaGuard3             Methodology=Guard Model    2025.02  79.56
Aegis-Guard-D           Methodology=Guard Model    2025.02  78.82
ShieldGemma-9B          Methodology=Guard Model    2025.02  73.1