
Multi-label Safety Categorization on HarmBench prompts

0.5432 Macro Accuracy

Opir-multitask-large

[Chart: Macro Accuracy; May 28, 2026]
Updated 5d ago

Evaluation Results

Method    Links
2026.05   0.5432
2026.05   0.4828
          0.2986
2026.05   0.171