
Safety Risk Detection on internal Agentic AI workflow benchmark

Precision: 100

ShieldGemma-9B

Dec 23, 2025
Updated 4d ago

Evaluation Results

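| Method | Date | Precision | Recall | F1 | (unlabeled) | Links |
|  | 2025.12 | 100 | 3 | 5 | 0 |  |
|  | 2025.12 | 100 | 5 | 10 | 0 |  |
|  | 2025.12 | 98 | 28 | 43 | 1 |  |
|  | 2025.12 | 98 | 77 | 86 | 2 |  |
|  | 2025.12 | 98 | 67 | 80 | 2 |  |
|  | 2025.12 | 97 | 23 | 37 | 1 |  |
|  | 2025.12 | 97 | 46 | 62 | 2 |  |
|  | 2025.12 | 96 | 25 | 40 | 2 |  |
|  | 2025.12 | 95 | 15 | 26 | 1 |  |
|  | 2025.12 | 86 | 23 | 36 | 5 |  |
|  | 2025.12 | 79 | 21 | 33 | 32 |  |
|  | 2025.12 | 65 | 84 | 73 | 64 |  |
|  | 2025.12 | 61 | 86 | 71 | 83 |  |
|  | 2025.12 | 60 | 75 | 67 | 71 |  |

| Method | Date | Precision | Recall | F1 | (unlabeled) | Links |
|  | 2025.12 | 0.94 | 0.68 | 0.79 | 0.05 |  |
|  | 2025.12 | 0.92 | 0.67 | 0.77 | 0.06 |  |
|  | 2025.12 | 0.91 | 0.71 | 0.8 | 0.07 |  |
|  | 2025.12 | 0.91 | 0.87 | 0.89 | 0.09 |  |
|  | 2025.12 | 0.88 | 0.52 | 0.65 | 0.07 |  |
|  | 2025.12 | 0.87 | 0.86 | 0.86 | 0.13 |  |
|  | 2025.12 | 0.87 | 0.89 | 0.88 | 0.11 |  |
|  | 2025.12 | 0.87 | 0.87 | 0.87 | 0.12 |  |
|  | 2025.12 | 0.85 | 0.9 | 0.87 | 0.15 |  |
|  | 2025.12 | 0.85 | 0.86 | 0.85 | 0.14 |  |
|  | 2025.12 | 0.84 | 0.88 | 0.86 | 0.16 |  |
|  | 2025.12 | 0.83 | 0.86 | 0.84 | 0.17 |  |
|  | 2025.12 | 0.83 | 0.94 | 0.88 | 0.18 |  |
|  | 2025.12 | 0.8 | 0.86 | 0.83 | 0.21 |  |
|  | 2025.12 | 0.48 | 0.02 | 0.04 | 0.02 |  |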