Out-of-distribution (OOD) Harmful Content Detection on AdvBench
[Chart: AUROC (vs Alpaca); w_opt, 99.2, Apr 20, 2026]
Updated 1mo ago
Evaluation Results
Method                                       AUROC (vs Alpaca)   AUROC (vs XSTest)   Min AUROC (OOD)
w_opt (strategy=Optimised dis..., 2026.04)   99.2                99.8                99.2
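The Min AUROC (OOD) value matches the lower of the two per-dataset AUROC scores in the w_opt row. The sketch below shows that aggregation, assuming the column is the minimum over the two comparison sets; the page does not state the definition, and the names `auroc_by_benign_set` and `min_auroc` are hypothetical.

```python
# Per-dataset AUROC values copied from the w_opt row of the results table.
auroc_by_benign_set = {
    "Alpaca": 99.2,
    "XSTest": 99.8,
}


def min_auroc(scores: dict[str, float]) -> float:
    """Return the worst-case AUROC across comparison sets.

    Assumption: "Min AUROC (OOD)" is the minimum of the per-dataset scores.
    """
    if not scores:
        raise ValueError("need at least one AUROC value")
    return min(scores.values())


worst = min_auroc(auroc_by_benign_set)
print(f"Min AUROC (OOD): {worst}")
```

Reporting the minimum rather than the mean means a method is ranked by its weakest comparison set.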