Safety Evaluation on Do-Not-Answer (test)
Leading entry: Full Precision, ASR 3.195 (Jan 17, 2026)
Evaluation Results
Method          Backbone                Date     ASR
Full Precision  Qwen-2.5-7B-I...        2026.01  3.195
AWQ-trust       Qwen-2.5-7B-I...        2026.01  3.301
AWQ             Qwen-2.5-7B-I...        2026.01  3.585
AWQ-trust       Llama-3.1-8B-...        2026.01  4.26
Full Precision  Gemma-7B-Inst...        2026.01  5.431
AWQ             Llama-3.1-8B-...        2026.01  5.964
Full Precision  Llama-3.1-8B-...        2026.01  6.07
AWQ-trust       Gemma-7B-Inst...        2026.01  6.567
AWQ             Gemma-7B-Inst...        2026.01  7.987