Fairness-sensitive reasoning on UnQover

99.9 Accuracy

C2PO

Updated 5mo ago

Evaluation Results

Method	Links	Accuracy
C2PO (2025.12)		99.9
GPT-4 (2025.12)		88.9

SOTA Paper

C2PO

C2PO: Diagnosing and Disentangling Bias Shortcuts in LLMs

Dataset

UNQOVER

Related Benchmarks

General Utility Evaluation on MT_Bench
Stereotypical Bias Mitigation on Unqover
Structural Bias Evaluation on MNLI
Structural Bias Evaluation on HANS
General Utility Evaluation on Chatbot

© 2026 wizwand