Jailbreaking on MaliciousInstruct
[Chart: BERT Score by method; best result LINT at 4.8, dated May 17, 2025. Available metrics: BERT Score, Harmful Score, Harmful Info Score.]
Updated 1mo ago
Evaluation Results
| Method | Configuration | Links | BERT Score | Harmful Score | Harmful Info Score |
|---|---|---|---|---|---|
| LINT | API Compatibility=false | 2025.05 | 4.8 | 2.36 | 0.4 |
| FLIP | API Compatibility=true | 2025.05 | 4.42 | 2.03 | 0.46 |
| Naive | API Compatibility=true | 2025.05 | 4.07 | 1.54 | 0.17 |
| JULI | Target Model=Llama2-7B... | 2025.05 | 4.01 | 3.84 | 3.19 |
| GCG | API Compatibility=false | 2025.05 | 3.63 | 1.74 | 0.37 |
| ED | API Compatibility=false | 2025.05 | 3.55 | 3.66 | 2.74 |
| JULI | API Compatibility=false | 2025.05 | 3.39 | 2.52 | 1.72 |
| JULI (API) | API Compatibility=true | 2025.05 | 2.86 | 1.95 | 0.71 |
| WTS | API Compatibility=false | 2025.05 | 2.04 | 1.33 | 0.21 |
| Base | API Compatibility=true | 2025.05 | 1.98 | 1.16 | 0.04 |
| Naive | Target Model=Llama2-7B... | 2025.05 | 1.21 | 1 | 0 |