Jailbreak Robustness on safe-unlearning

Avg Evaluation Score (k=4): 98

Self-ReSET

Updated 2mo ago

Evaluation Results

Method	Links	Avg Evaluation Score (k=4)
Self-ReSET 2026.05		98
Self-ReSET 2026.05		97.3
STAR-1 2026.05		95.9
Self-ReSET 2026.05		95.6
RECAP 2026.05		93.7
DAPO 2026.05		89.8
DAPO 2026.05		88.2
RECAP 2026.05		87.9
RECAP 2026.05		84.9
STAR-1 2026.05		81.8
DAPO 2026.05		81.3
STAR-1 2026.05		79.9
Safechain 2026.05		71.7
Safechain 2026.05		68.6
Safechain 2026.05		61.8
Base 2026.05		49.7
Base 2026.05		48.4
Base 2026.05		45.9