Text-based safety moderation on Aegis
Chart (F1 Score, Accuracy): best result is OMNIGUARD-7B with an F1 Score of 84, dated Dec 2, 2025.
Evaluation Results
| Method | Size | Date | F1 Score | Accuracy |
|---|---|---|---|---|
| OMNIGUARD-7B | 7B | 2025.12 | 84 | 84.1 |
| Qwen3-235B | 235B | 2025.12 | 83.3 | 83.6 |
| OMNIGUARD-3B | 3B | 2025.12 | 82.2 | 82.6 |
| Qwen2.5-Omni-7B | 7B | 2025.12 | 77.3 | 73.9 |
| Qwen2.5-7B | 7B | 2025.12 | 75.4 | 77.5 |
| ThinkGuard | 8B | 2025.12 | 69.9 | 74.6 |
| Qwen2.5-72B | 72B | 2025.12 | 68.9 | 73.2 |
| LLaMA Guard 3 | 8B | 2025.12 | 65.8 | 73.5 |
| LLaMA-3.3-70B | 70B | 2025.12 | 65.7 | 69.4 |
| LLaMA Guard 2 | 8B | 2025.12 | 59 | 69.2 |
| GPT-4o | - | 2025.12 | 54.3 | 65.6 |
| LLaMA Guard 1 | 7B | 2025.12 | 53 | 66.9 |