Fairness Evaluation on CEB-Jigsaw
[Chart: Score. Top result: Self-Debias Iter1 + Self-Correction, 73.5, Apr 9, 2026.]
Evaluation Results
| Method | Date | Score |
|---|---|---|
| Self-Debias Iter1 + Self-Correction | 2026.04 | 73.5 |
| Self-Debias Iter1 | 2026.04 | 73.1 |
| Qwen2.5-7B-Instruct | 2026.04 | 72.7 |
| Self-Debias Offline + Self-Correction | 2026.04 | 72.5 |
| Qwen1.5-8B | 2026.04 | 72.4 |
| Self-Debias Iter2 | 2026.04 | 72.1 |
| Self-Debias Iter2 + Self-Correction | 2026.04 | 71.9 |
| Self-Debias Offline | 2026.04 | 71.7 |
| Self-Debias SFT + Self-Correction | 2026.04 | 70.7 |
| Self-Debias SFT | 2026.04 | 70.5 |
| Qwen2.5-7B-Instruct + Self-Correction | 2026.04 | 68.3 |
| Llama-3.1-8B-Instruct | 2026.04 | 67.3 |
| DeepSeek-R1-Distill-Qwen-7B | 2026.04 | 65.8 |
| DeepSeek-R1-Distill-Qwen-7B + Self-Correction | 2026.04 | 45.1 |
| Llama-3.1-8B-Instruct + Self-Correction | 2026.04 | 29.1 |
| Qwen1.5-8B + Self-Correction | 2026.04 | 20.6 |