LLM Jailbreaking on Mistral-SU
[Chart: SRF (Mistral-SU). Highlighted point: Adaptive Probe-based Steering, 46, May 19, 2026. Selectable metrics: SRF (Mistral-SU), Harmful Behavior Rate (Mistral-SU), Success Rate (Mistral-SU).]
Updated 13d ago
Evaluation Results
| Method | Date | SRF (Mistral-SU) | Harmful Behavior Rate (Mistral-SU) | Success Rate (Mistral-SU) |
|---|---|---|---|---|
| Adaptive Probe-based Steering | 2026.05 | 46 | 57 | 77 |
| RD-A | 2026.05 | 8 | 4 | 15 |
| RD-C | 2026.05 | 4 | 3 | 8 |
| Angular | 2026.05 | 4 | 2 | 7 |
| RepE | 2026.05 | 2 | 1 | 3 |
| SCAV | 2026.05 | 1 | 1 | 0 |