Fairness Evaluation on CrowS-Pairs
[Chart: Score by date. Top entry: Self-Debias Iter2 + Self-Correction, Score 72.2 (Apr 9, 2026). Updated 8d ago.]
Evaluation Results
| Method | Date | Score |
| --- | --- | --- |
| Self-Debias Iter2 + Self-Correction | 2026.04 | 72.2 |
| Self-Debias Iter2 | 2026.04 | 71.2 |
| Self-Debias Iter1 + Self-Correction | 2026.04 | 70.2 |
| Self-Debias Iter1 | 2026.04 | 70 |
| Qwen1.5-8B | 2026.04 | 68.8 |
| Qwen1.5-8B + Self-Correction | 2026.04 | 68.8 |
| Self-Debias Offline + Self-Correction | 2026.04 | 68.5 |
| Self-Debias SFT | 2026.04 | 68.2 |
| Self-Debias Offline | 2026.04 | 67.8 |
| Self-Debias SFT + Self-Correction | 2026.04 | 67.5 |
| Qwen2.5-7B-Instruct | 2026.04 | 66.5 |
| DeepSeek-R1-Distill-Qwen-7B | 2026.04 | 59.2 |
| Qwen2.5-7B-Instruct + Self-Correction | 2026.04 | 59.2 |
| DeepSeek-R1-Distill-Qwen-7B + Self-Correction | 2026.04 | 58.5 |
| Llama-3.1-8B-Instruct | 2026.04 | 54.2 |
| Llama-3.1-8B-Instruct + Self-Correction | 2026.04 | 51 |