Systemic Risk Detection on Safe-OS

100 Normal Count

AgentMonitor

Updated 4mo ago

Evaluation Results

Method	Date	Links
AgentMonitor	2025.02		100	46.7	39.1	85
LLaMA-Guard 3	2025.02		100	55.2	100	100
AGrail	2025.02		95.6	3.8	0	5
AGrail	2025.02		95.6	4	0	10
ToolEmu	2025.02		57.7	4.2	100	35
GPT-4o	2025.02		52.4	7.7	61.9	15
Claude-3.5	2025.02		50	0	14.3	20