Disambiguation and completeness on AmbigQA
[Chart: Personalization Bias; top entry Identity-Robust Generation at 0.113]
Jan 14, 2026
Personalization Bias
Updated 4d ago
Evaluation Results
Method                       Model          Date      Personalization Bias
Identity-Robust Generation   Llama3.3-70B   2026.01   0.113
Identity-Robust Generation   Qwen3          2026.01   0.131
Identity-Robust Generation   gpt-oss-20B    2026.01   0.171
Prompt Steering              Llama3.3-70B   2026.01   0.337
Vanilla Generation           Llama3.3-70B   2026.01   0.402
Prompt Steering              gpt-oss-20B    2026.01   0.467
Vanilla Generation           Qwen3          2026.01   0.47
Prompt Steering              Qwen3          2026.01   0.557
Vanilla Generation           gpt-oss-20B    2026.01   0.645