Authorization on GPT judge traces n=500 5.4
[Chart: Benign Rate, GPT-5.4 judge: 84.85 (May 18, 2026). Metric views: Benign Rate, UAR (Unintended Action Rate), ASR (Attack Success Rate).]
Updated 14d ago
Evaluation Results
Method                          Benign Rate   UAR (Unintended Action Rate)   ASR (Attack Success Rate)
GPT-5.4 judge (n=500, 2026.05)  84.85         99.25                          99