LLM Red-teaming on Target Victim Model

2 Unknown/Unsafe Attacks

SFT

-3.2832.3668103.64 May 1, 2026
Updated 1mo ago

Evaluation Results

Method Links
2026.05
20.23
2026.05
391.7
436.75
2026.05
5.330.52
2026.05
17.6793.75
2026.05
212.54
3366.11
2026.05
75.337.36
2026.05
13492.55