OOD Detection on ToxicChat (test)
[Chart: Length-Matched AUROC. Best Single: 60.2 (Apr 30, 2026). Updated 1mo ago.]
Evaluation Results
Method       Details                    Date     Length-Matched AUROC
Best Single  category=Trajectory, b...  2026.04  60.2
Maha-Traj    category=Trajectory, b...  2026.04  58
RAUQ         category=Attn-Based, b...  2026.04  53
CED          category=Attn-Based, b...  2026.04  50.9
Attn Ent.    category=Attn-Based, b...  2026.04  50.9