Safety risk detection on TS-Bench
[Chart: Overall F1 Score. Breeze Guard: 86 (Mar 7, 2026). Metrics available: Overall F1 Score, SCAM Score, FIN Score, HEALTH Score, GENDER Score, GROUP Score, POL Score.]
Updated 1mo ago
Evaluation Results
| Method | Setting | Date | Overall F1 Score | SCAM Score | FIN Score | HEALTH Score | GENDER Score | GROUP Score | POL Score |
|---|---|---|---|---|---|---|---|---|---|
| Breeze Guard | Inference mode=non-thi... | 2026.03 | 86 | 85 | 80 | 87 | 88 | 98 | 97 |
| Breeze Guard | Inference mode=thinking | 2026.03 | 84 | 93 | 73 | 87 | 89 | 93 | 95 |
| Granite Guardian 3.3 | Model scale=8B | 2026.03 | 69 | 18 | 38 | 80 | 89 | 86 | 100 |