Fairness Evaluation on CEB Adult
Leading entry: Self-Debias Iter1, score 68.3 (Apr 9, 2026). Updated 8d ago.
Evaluation Results
| Method | Configuration | Date | Score |
| --- | --- | --- | --- |
| Self-Debias Iter1 | Optimization Stage=Ite... | 2026.04 | 68.3 |
| Self-Debias Iter2 + Self-Correction | Optimization Stage=Ite... | 2026.04 | 68.1 |
| Qwen2.5-7B-Instruct | Self-Correction=False | 2026.04 | 68 |
| Self-Debias Offline | Optimization Stage=Off... | 2026.04 | 67.5 |
| Self-Debias Iter1 + Self-Correction | Optimization Stage=Ite... | 2026.04 | 67.2 |
| Self-Debias Offline + Self-Correction | Optimization Stage=Off... | 2026.04 | 67.1 |
| Self-Debias Iter2 | Optimization Stage=Ite... | 2026.04 | 67.1 |
| Self-Debias SFT + Self-Correction | Optimization Stage=SFT... | 2026.04 | 66.9 |
| Self-Debias SFT | Optimization Stage=SFT... | 2026.04 | 66.5 |
| Qwen2.5-7B-Instruct + Self-Correction | Self-Correction=True | 2026.04 | 63.7 |
| Qwen1.5-8B | Self-Correction=False | 2026.04 | 63.1 |
| DeepSeek-R1-Distill-Qwen-7B | Self-Correction=False | 2026.04 | 50.3 |
| DeepSeek-R1-Distill-Qwen-7B + Self-Correction | Self-Correction=True | 2026.04 | 49.2 |
| Qwen1.5-8B + Self-Correction | Self-Correction=True | 2026.04 | 37.1 |
| Llama-3.1-8B-Instruct | Self-Correction=False | 2026.04 | 21.6 |
| Llama-3.1-8B-Instruct + Self-Correction | Self-Correction=True | 2026.04 | 6.9 |