Agent Safety Evaluation on AgentHarm (held-out)
[Chart: metric over time, selectable among HCR, VRR, and SafeScore. Labeled data point: FATE, HCR 12.5, May 12, 2026.]
Updated 21d ago
Evaluation Results
| Method | Configuration | Date | HCR | VRR | SafeScore |
|---|---|---|---|---|---|
| FATE | Backbone=Qwen3-8B-Inst... | 2026.05 | 12.5 | 81.2 | 87 |
| FATE | Backbone=Ministral-3-8... | 2026.05 | 14.1 | 79.7 | 85.8 |
| FATE | Backbone=Llama-3.1-8B-... | 2026.05 | 15.6 | 78.1 | 84.2 |
| FATE | Backbone=Phi-4-reasoning | 2026.05 | 16.4 | 78.1 | 83.6 |
| FATE | Backbone=Gemma-3-12B-it | 2026.05 | 17.2 | 76.6 | 82.1 |
| Base | Backbone=Gemma-3-12B-it | 2026.05 | 62.5 | 23.4 | 33.7 |
| Base | Backbone=Ministral-3-8... | 2026.05 | 64.1 | 21.9 | 31.4 |
| Base | Backbone=Llama-3.1-8B-... | 2026.05 | 67.2 | 18.8 | 28.6 |
| Base | Backbone=Phi-4-reasoning | 2026.05 | 68.8 | 20.3 | 30.1 |
| Base | Backbone=Qwen3-8B-Inst... | 2026.05 | 71.9 | 15.6 | 24.1 |