Instruction Adherence and Security Robustness on StruQ 1.0 (Adversarial)
[Chart: Capability Score by date. Highlighted point: 85.46 (Text), Dec 3, 2025. Selectable metrics: Capability Score, ASR (Naive), ASR (Ignore), ASR (Escape-S), ASR (Completion-R), ASR (Average), ASR (Worst).]
Updated 4d ago
Evaluation Results
| Method | Date | Capability Score | ASR (Naive) | ASR (Ignore) | ASR (Escape-S) | ASR (Completion-R) | ASR (Average) | ASR (Worst) |
|---|---|---|---|---|---|---|---|---|
| Text (Training Type=Vanilla...) | 2025.12 | 85.46 | 3.37 | 0.96 | 2.88 | 98.08 | 26.32 | 98.08 |
| CAHL | 2025.12 | 83.79 | 1.44 | 1.44 | 1.44 | 2.4 | 1.68 | 2.4 |
| Delimiter (Training Type=Special...) | 2025.12 | 83.29 | 2.88 | 0.96 | 2.88 | 54.81 | 15.38 | 54.81 |
| ISE | 2025.12 | 79.22 | 1.44 | 0.96 | 1.44 | 8.65 | 3.13 | 8.65 |
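
The ASR (Average) and ASR (Worst) figures appear to be aggregates of the four per-attack ASR values (Naive, Ignore, Escape-S, Completion-R). The page does not state the formula, so the sketch below assumes an unweighted mean for the average and a maximum for the worst case. The `results` dictionary and the `aggregate` helper are illustrative names, not part of the benchmark. Under that assumption the listed values are reproduced to within rounding; for ISE the mean of the rounded inputs comes out near 3.12 rather than the listed 3.13, presumably because the published per-attack values are themselves rounded.

```python
# Per-attack ASR values copied from the Evaluation Results above.
results = {
    "Text":      {"Naive": 3.37, "Ignore": 0.96, "Escape-S": 2.88, "Completion-R": 98.08},
    "CAHL":      {"Naive": 1.44, "Ignore": 1.44, "Escape-S": 1.44, "Completion-R": 2.4},
    "Delimiter": {"Naive": 2.88, "Ignore": 0.96, "Escape-S": 2.88, "Completion-R": 54.81},
    "ISE":       {"Naive": 1.44, "Ignore": 0.96, "Escape-S": 1.44, "Completion-R": 8.65},
}


def aggregate(asr_by_attack):
    """Return (average, worst) ASR over the attacks.

    Assumption: average is the unweighted mean and worst is the maximum;
    the source page does not confirm these definitions.
    """
    values = list(asr_by_attack.values())
    return sum(values) / len(values), max(values)


if __name__ == "__main__":
    for method, asr in results.items():
        average, worst = aggregate(asr)
        print(f"{method:10s} average={average:.2f} worst={worst:.2f}")
```

Read this way, a method's worst-case ASR is driven entirely by its weakest attack column, which in every row here is Completion-R.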