Safety assessment on StrongReject
[Chart: Personalization Bias (PB); highlighted entry: Identity-Robust Generation, 0.179]
Jan 14, 2026
Updated 4d ago
Evaluation Results
Method                       Model          Date      Personalization Bias (PB)
Identity-Robust Generation   Llama3.3-70B   2026.01   0.179
Identity-Robust Generation   Qwen3          2026.01   0.201
Prompt Steering              Qwen3          2026.01   0.282
Identity-Robust Generation   gpt-oss-20B    2026.01   0.325
Vanilla Generation           Llama3.3-70B   2026.01   0.384
Vanilla Generation           Qwen3          2026.01   0.408
Prompt Steering              Llama3.3-70B   2026.01   0.444
Vanilla Generation           gpt-oss-20B    2026.01   0.831
Prompt Steering              gpt-oss-20B    2026.01   1.413