Safety Classification on GuardSet (test)
[Chart: Accuracy (Harmless), Accuracy (Harmful), Accuracy (Average). Highlighted point: GPT-4o-mini, Accuracy (Harmless) 96.26, Apr 8, 2026.]
Updated 9d ago
Evaluation Results
Method            Evaluation Paradigm   Date      Accuracy (Harmless)   Accuracy (Harmful)   Accuracy (Average)
GPT-4o-mini       Gu...                 2026.04   96.26                 80.06                88.16
GPT-4o            Gu...                 2026.04   95.41                 87.39                91.4
GuardAdvisor      Gu...                 2026.04   95.08                 85.95                90.52
Granite-Guardian  Bi...                 2026.04   92.07                 89.06                90.57
WildGuard         Bi...                 2026.04   91.67                 89.06                90.37
Llama-Guard-4     Bi...                 2026.04   64.35                 94.21                79.28
Llama-Guard-3     Bi...                 2026.04   57.08                 96.09                76.59
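The Accuracy (Average) values listed above appear consistent with an unweighted mean of Accuracy (Harmless) and Accuracy (Harmful). The page does not state how the average is computed, so the sketch below is only a check under that assumption, with half-up rounding to two decimals also assumed. The names `ROWS`, `mean_accuracy`, and `check_rows` are illustrative and not part of the benchmark.

```python
from decimal import Decimal, ROUND_HALF_UP

# (method, harmless, harmful, listed average), copied from the results above.
# Values are kept as strings so Decimal parses them exactly.
ROWS = [
    ("GPT-4o-mini", "96.26", "80.06", "88.16"),
    ("GPT-4o", "95.41", "87.39", "91.4"),
    ("GuardAdvisor", "95.08", "85.95", "90.52"),
    ("Granite-Guardian", "92.07", "89.06", "90.57"),
    ("WildGuard", "91.67", "89.06", "90.37"),
    ("Llama-Guard-4", "64.35", "94.21", "79.28"),
    ("Llama-Guard-3", "57.08", "96.09", "76.59"),
]


def mean_accuracy(harmless: str, harmful: str) -> Decimal:
    """Unweighted mean of the two accuracies.

    Assumption: the result is rounded half-up to two decimal places.
    """
    total = Decimal(harmless) + Decimal(harmful)
    return (total / 2).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


def check_rows(rows):
    """Map each method to whether its listed average matches the assumed formula."""
    return {
        name: mean_accuracy(harmless, harmful) == Decimal(listed)
        for name, harmless, harmful, listed in rows
    }


if __name__ == "__main__":
    for name, matches in check_rows(ROWS).items():
        print(f"{name}: {'matches' if matches else 'differs'}")
```

`Decimal` is used instead of `float` because several of the means land exactly on a half-cent boundary (for example 90.515), where binary floating point could round either way.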