Systemic Risk Detection on EIA
[Chart: Action Grounding (Grd). Highlighted entry: LLaMA-Guard 3, 94 (Feb 17, 2025). Selectable metrics: Action Grounding (Grd), Action Generation (Gen), Normal Score (Norm).]
Updated 1mo ago
Evaluation Results
Method          Framework Category   Date      Action Grounding (Grd)   Action Generation (Gen)   Normal Score (Norm)
LLaMA-Guard 3   Gua...               2025.02   94                       90                        100
AgentMonitor    Gua...               2025.02   58                       40                        100
GPT-4o          Mod...               2025.02   42                       16                        66.7
Claude-3.5      Mod...               2025.02   40                       28                        56.7
AGrail          Gua...               2025.02   8                        26                        76.7
AGrail          Gua...               2025.02   6                        28                        86.7