
LLM Jailbreaking on Llama3-DeRTA

Success Rate First (SRF): 61

Adaptive Probe-based Steering

-2.44  14.03  30.54  6.97
May 19, 2026
Updated 13d ago

Evaluation Results

Method    Links
61  78  91    2026.05
32  44  48    2026.05
25  37  46    2026.05
7  13  15    2026.05
1  4  0    2026.05
0  1  0