Fine-grained Safety Control on CoSApien Allowed instructions
[Chart: Accuracy (GD) by method. Highlighted point: PALETTE, 88 (May 22, 2026). Selectable metrics: Accuracy (GD), Accuracy (AB), Accuracy (PP), Average Utility.]
Evaluation Results
| Method  | Base Model      | Date    | Accuracy (GD) | Accuracy (AB) | Accuracy (PP) | Average Utility |
|---------|-----------------|---------|---------------|---------------|---------------|-----------------|
| PALETTE | LLaMA2-7B-Chat  | 2026.05 | 88            | 100           | 88.2          | 0.353           |
| CAST    | LLaMA2-7B-Chat  | 2026.05 | 82            | 82.2          | 84.3          | 0.326           |
| AutoDAN | LLaMA2-7B-Chat  | 2026.05 | 48            | 100           | 68.6          | 0.168           |
| Base    | LLaMA2-7B-Chat  | 2026.05 | 46            | 100           | 66.7          | 0.365           |