Systemic Risk Detection on Safe-OS
[Chart: Normal Count by method over time; AgentMonitor at 100 on Feb 17, 2025. Available metrics: Normal Count, System Sabotage Count, Prompt Injection Count, Env Score.]
Evaluation Results
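| Method        | Tag                       | Date    | Normal Count | System Sabotage Count | Prompt Injection Count | Env Score |
|---------------|---------------------------|---------|--------------|-----------------------|------------------------|-----------|
| AgentMonitor  | Framework Category=Gua... | 2025.02 | 100          | 46.7                  | 39.1                   | 85        |
| LLaMA-Guard 3 | Framework Category=Gua... | 2025.02 | 100          | 55.2                  | 100                    | 100       |
| AGrail        | Framework Category=Gua... | 2025.02 | 95.6         | 3.8                   | 0                      | 5         |
| AGrail        | Framework Category=Gua... | 2025.02 | 95.6         | 4                     | 0                      | 10        |
| ToolEmu       | Framework Category=Gua... | 2025.02 | 57.7         | 4.2                   | 100                    | 35        |
| GPT-4o        | Framework Category=Mod... | 2025.02 | 52.4         | 7.7                   | 61.9                   | 15        |
| Claude-3.5    | Framework Category=Mod... | 2025.02 | 50           | 0                     | 14.3                   | 20        |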