Utility assessment on MMLU-Pro
[Figure: Personalization Bias (PB) chart. Highlighted entry: Identity-Robust Generation, PB 16.5 (Jan 14, 2026).]
Evaluation Results
Method                      Model         Date     Personalization Bias (PB)
Identity-Robust Generation  Llama3.3-70B  2026.01  16.5
Identity-Robust Generation  Qwen3         2026.01  19.6
Identity-Robust Generation  gpt-oss-20B   2026.01  25.2
Vanilla Generation          Llama3.3-70B  2026.01  35
Prompt Steering             Llama3.3-70B  2026.01  36.6
Prompt Steering             Qwen3         2026.01  49
Vanilla Generation          Qwen3         2026.01  52.3
Prompt Steering             gpt-oss-20B   2026.01  60.8
Vanilla Generation          gpt-oss-20B   2026.01  448.3