Safety Evaluation on MultiJail Italian

100 Safe Response Rate

Direct Instruction

Updated 1mo ago

Evaluation Results

Method	Date	Links	Safe Response Rate
Direct Instruction	2025.09		100
AIM	2025.09		100
SteerMoE	2025.09		90.2
SteerMoE + AIM	2025.09		11.4