Jailbreak Mitigation on GCG
[Chart: GCG ASR by date, Sep 30, 2025 to Apr 17, 2026. Selectable metrics: GCG ASR, OR-Bench Toxic Score, OR-Bench Hard Score, MMLU Accuracy, R-Score, Overall Mitigation Score.]
Updated 1mo ago
Evaluation Results
| Method | Date | GCG ASR | OR-Bench Toxic Score | OR-Bench Hard Score | MMLU Accuracy | R-Score | Overall Mitigation Score |
|---|---|---|---|---|---|---|---|
| Goal (Model=Mistral-7B) | 2026.04 | 0 | - | - | - | - | - |
| ASGUARD | 2025.09 | 1 | 96.7 | 59.5 | 68.3 | 76 | 45 |
| SFT (Data Mixture=30/70) | 2025.09 | 2 | 66.1 | 2.27 | 67.3 | 13.8 | 13.4 |
| RepBend | 2025.09 | 5 | 94.3 | 50.4 | 68.1 | 73.3 | 41.7 |
| CB (Model=Mistral-7B) | 2026.04 | 14 | - | - | - | - | - |
| Llama-3.1-8B-Instruct (Backbone=Llama-3.1, Pa...) | 2025.09 | 15 | 88.5 | 28.9 | 68.2 | - | - |
| Beam (Model=Mistral-7B) | 2026.04 | 17.5 | - | - | - | - | - |
| Base (Model=Mistral-7B) | 2026.04 | 53.2 | - | - | - | - | - |