Safety Classification on ToxicChat (out-of-distribution)

72.88 F1 Score

Multi-head self-attn

[Chart: score axis ticks at 46.0688, 53.0294, 59.99, 66.9506; date label Jan 19, 2026]
Updated 4d ago

Evaluation Results

Date      F1 Score   (unlabeled)
2026.01   72.88      0.798
2026.01   70.8       -
2026.01   70         -
2026.01   68.3       -
2026.01   67.7       -
2026.01   64.81      0.706
2026.01   64.3       -
2026.01   61.6       0.626
2026.01   61.4       0.631
2026.01   53.33      0.565
2026.01   47.1       -