Bias Evaluation on Honest
[Chart: Honest Score over time for Llama 3.1 8B - LFT w. SH-Dgender (BaseCDA); latest value 11.7 as of Dec 11, 2025]
Evaluation Results
| Method | Configuration | Date | Honest Score |
| --- | --- | --- | --- |
| Llama 3.1 8B - LFT w. SH-Dgender (BaseCDA) | Model=Llama 3.1 8B, Va... | 2025.12 | 11.7 |
| Llama 3.1 8B - LFT w. SH-Dgender (GC-CDA) | Model=Llama 3.1 8B, Va... | 2025.12 | 12.1 |
| Llama 3.1 8B - LFT w. SH (baseline 2) | Model=Llama 3.1 8B, Va... | 2025.12 | 12.4 |
| Qwen 3 0.6B - LFT w. SH-N (baseline 3) | Model=Qwen 3 0.6B, Var... | 2025.12 | 13.2 |
| Llama 3.2 1B - LFT w. SH (baseline 2) | Model=Llama 3.2 1B, Va... | 2025.12 | 14.6 |
| Llama 3.1 8B - LFT w. SH-N (baseline 3) | Model=Llama 3.1 8B, Va... | 2025.12 | 14.6 |
| Llama 3.1 8B - Pretrained model (baseline 1) | Model=Llama 3.1 8B, Va... | 2025.12 | 14.7 |
| Llama 3.2 1B - LFT w. SH-Dgender (BaseCDA) | Model=Llama 3.2 1B, Va... | 2025.12 | 14.8 |
| Qwen 3 0.6B - Pretrained model (baseline 1) | Model=Qwen 3 0.6B, Var... | 2025.12 | 15.8 |
| Llama 3.2 1B - LFT w. SH-Dgender (GC-CDA) | Model=Llama 3.2 1B, Va... | 2025.12 | 15.9 |
| Qwen 3 0.6B - LFT w. SH-Dgender (BaseCDA) | Model=Qwen 3 0.6B, Var... | 2025.12 | 17.8 |
| Qwen 3 0.6B - LFT w. SH (baseline 2) | Model=Qwen 3 0.6B, Var... | 2025.12 | 18.2 |
| Qwen 3 0.6B - LFT w. SH-Dgender (GC-CDA) | Model=Qwen 3 0.6B, Var... | 2025.12 | 19.7 |
| Llama 3.2 1B - Pretrained model (baseline 1) | Model=Llama 3.2 1B, Va... | 2025.12 | 20.8 |
| Llama 3.2 1B - LFT w. SH-N (baseline 3) | Model=Llama 3.2 1B, Va... | 2025.12 | 21.4 |
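To compare the fine-tuning variants across the three base models, the table can be aggregated directly. Below is a minimal Python sketch using only the scores listed above; it assumes this leaderboard follows the HONEST benchmark's usual convention that lower scores (fewer hurtful completions) are better. The key names are transcribed from the table, not part of any official API.

```python
from statistics import mean

# Honest Scores transcribed from the Evaluation Results table (2025.12 runs).
scores = {
    ("Llama 3.1 8B", "Pretrained model (baseline 1)"): 14.7,
    ("Llama 3.1 8B", "LFT w. SH (baseline 2)"): 12.4,
    ("Llama 3.1 8B", "LFT w. SH-N (baseline 3)"): 14.6,
    ("Llama 3.1 8B", "LFT w. SH-Dgender (BaseCDA)"): 11.7,
    ("Llama 3.1 8B", "LFT w. SH-Dgender (GC-CDA)"): 12.1,
    ("Llama 3.2 1B", "Pretrained model (baseline 1)"): 20.8,
    ("Llama 3.2 1B", "LFT w. SH (baseline 2)"): 14.6,
    ("Llama 3.2 1B", "LFT w. SH-N (baseline 3)"): 21.4,
    ("Llama 3.2 1B", "LFT w. SH-Dgender (BaseCDA)"): 14.8,
    ("Llama 3.2 1B", "LFT w. SH-Dgender (GC-CDA)"): 15.9,
    ("Qwen 3 0.6B", "Pretrained model (baseline 1)"): 15.8,
    ("Qwen 3 0.6B", "LFT w. SH (baseline 2)"): 18.2,
    ("Qwen 3 0.6B", "LFT w. SH-N (baseline 3)"): 13.2,
    ("Qwen 3 0.6B", "LFT w. SH-Dgender (BaseCDA)"): 17.8,
    ("Qwen 3 0.6B", "LFT w. SH-Dgender (GC-CDA)"): 19.7,
}

def variant_means(scores):
    """Average Honest Score per fine-tuning variant across base models."""
    by_variant = {}
    for (model, variant), s in scores.items():
        by_variant.setdefault(variant, []).append(s)
    return {v: round(mean(xs), 2) for v, xs in by_variant.items()}

# Print variants from best (lowest mean score) to worst.
for variant, avg in sorted(variant_means(scores).items(), key=lambda kv: kv[1]):
    print(f"{avg:5.2f}  {variant}")
```

Under this aggregation, the BaseCDA variant has the lowest mean score (14.77) of the five variants across these three models, while the pretrained baselines average the highest (17.1).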