
Robustness Evaluation on BiasBench

Accuracy: 82.5

Qwen2.5-32B-Instruct

[Chart: Accuracy over time (Jan 7, 2026)]
Updated 4d ago

Evaluation Results

Date       Accuracy
2026.01    82.5
           81.25
2026.01    81.25
2026.01    80
2026.01    77.5
2026.01    75
2026.01    67.5
           65