LLM Jailbreaking on R2D2

31 SRF

Adaptive Probe-based Steering

0.848.6716.524.33 May 19, 2026
Updated 13d ago

Evaluation Results

Method Links
314164
2026.05
253550
2026.05
233145
2026.05
182539
2026.05
440
2026.05
251