Jailbreak Robustness on StrongREJECT
Best Safe Response Rate: 95.39, Qwen (SFT_DB), May 27, 2025
Evaluation Results
| Method | Configuration | Date | Safe Response Rate |
|---|---|---|---|
| Qwen (SFT_DB) | LLM=Qwen, SFT Status=S... | 2025.05 | 95.39 |
| DPO_Whisperer | Alignment=DPO, Data Re... | 2025.05 | 94.91 |
| Mixtral (SFT_DB) | LLM=Mixtral, SFT Statu... | 2025.05 | 94.04 |
| SFT_DB | Training=Supervised Fi... | 2025.05 | 94.04 |
| Qwen (Base) | LLM=Qwen, SFT Status=None | 2025.05 | 72.84 |
| Mixtral (SFT_OG) | LLM=Mixtral, SFT Statu... | 2025.05 | 67.01 |
| Qwen (SFT_OG) | LLM=Qwen, SFT Status=S... | 2025.05 | 59.48 |
| Mixtral (Base) | LLM=Mixtral, SFT Statu... | 2025.05 | 51.09 |