
Explainability classification on WildGuardMix human-annotated (test)

60.69 F1 Score

LEG base

[Chart: F1 Score over time; axis label: Jan 24, 2026]
Updated 1 month ago

Evaluation Results

Date      F1 Score   Method   Links
2026.01   60.69
2026.01   58.39
2026.01   54.87