
PolicyGuardBench

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Violation Detection | PolicyGuardBench | Safety F1: 88.83 | 30 |
| Prefix-based violation detection | POLICYGUARDBENCH | Accuracy (N=1): 93.89 | 16 |
| Policy-trajectory compliance evaluation | POLICYGUARDBENCH | F1 Score: 88.83 | 10 |