Bias Evaluation on Race & Gender
[Chart: Mean Improvement. Series: iPASwo. Date: Sep 25, 2025.]
Updated 16d ago
Evaluation Results
| Method | Configuration | Date | Mean Improvement | 95% CI | P-Value |
|---|---|---|---|---|---|
| iPASwo | Model=Llama-3.1-8B-Ins... | 2025.09 | 20 | 0.19 | 0 |
| iPASwo | Model=Nous-Hermes-2-Mi... | 2025.09 | 17 | 0.17 | 0 |
| iPASa | Model=Llama-3.1-8B-Ins... | 2025.09 | 17 | 0.16 | 0 |
| iPASa | Model=Nous-Hermes-2-Mi... | 2025.09 | 14 | 0.13 | 0 |
| iPASwo | Model=DeepSeek-R1-Dist... | 2025.09 | 14 | 0.13 | 0 |
| PASf | Model=Llama-3.1-8B-Ins... | 2025.09 | 11 | 0.1 | 0 |
| iPASa | Model=DeepSeek-R1-Dist... | 2025.09 | 10 | 0.09 | 0 |
| PASf | Model=Nous-Hermes-2-Mi... | 2025.09 | 7 | 0.06 | 0 |
| PASf | Model=DeepSeek-R1-Dist... | 2025.09 | 4 | 0.03 | 0 |