Harmful Conversation Detection on ProsocialDialog 652-conversation (test)
[Chart: Precision by method; highlighted point: LlamaGuard, 98 (Jan 24, 2026). Metrics: Precision, Recall, F1 Score.]
Updated 1mo ago
Evaluation Results
Method                        Date      Precision  Recall  F1 Score
LlamaGuard (model_size=8B)    2026.01   98         50      66
Keyword                       2026.01   95         32      48
Lattice                       2026.01   90         93      91
NeMo                          2026.01   82         93      87
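
The F1 Score values listed above are consistent with the standard definition of F1 as the harmonic mean of precision and recall, rounded to the nearest integer. The sketch below recomputes F1 from the listed precision and recall figures. It assumes the scores are percentages and that the leaderboard uses the standard F1 formula; neither is stated on the page, and the names `RESULTS` and `f1_score` are illustrative only.

```python
# Precision and recall (assumed to be percentages) as listed in the results above.
RESULTS = {
    "LlamaGuard": (98, 50),
    "Keyword": (95, 32),
    "Lattice": (90, 93),
    "NeMo": (82, 93),
}


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; returns 0.0 if both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Print the recomputed F1 for each method, to two decimals.
for method, (p, r) in RESULTS.items():
    print(f"{method}: F1 = {f1_score(p, r):.2f}")
```

The harmonic mean is dominated by the smaller of the two inputs, which is why LlamaGuard and Keyword, with high precision but low recall, end up below Lattice and NeMo on F1.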