Bias Evaluation on BBQ Gender
[Figure: Ambiguity Score chart. Highlighted result: KLAAD, 47.2 (Mar 19, 2026). Selectable metrics: Ambiguity Score, Accuracy. Updated 29d ago.]
Evaluation Results
Method     Configuration          Date      Ambiguity Score   Accuracy
KLAAD      Backbone=LLaMA-3-8B    2026.03   47.2              36.9
CDA        Backbone=LLaMA-3-8B    2026.03   41.9              38.1
ORIGINAL   Backbone=LLaMA-3-8B    2026.03   37.2              38.8
UGID       Backbone=LLaMA-3-8B    2026.03   34.2              39.8