Bias and Sentiment Evaluation on BOLD
[Chart: BOLD Score. Top entry: CODE LLAMA - INSTRUCT 7B, 50.3 (Aug 24, 2023).]
Evaluation Results
Method                     Details                    Date     BOLD Score
CODE LLAMA - INSTRUCT 7B   Alignment=Instruct (al...  2023.08  50.3
LLAMA 2 CHAT 7B            Alignment=Instruct (al...  2023.08  48.2
LLAMA 2 CHAT 13B           Alignment=Instruct (al...  2023.08  47.1
LLAMA 2 CHAT 34B           Alignment=Instruct (al...  2023.08  46.1
CODE LLAMA - INSTRUCT 34B  Alignment=Instruct (al...  2023.08  45.2
CODE LLAMA - INSTRUCT 13B  Alignment=Instruct (al...  2023.08  36.5
Falcon-instruct 7B         Alignment=Instruct (al...  2023.08  33.2
LLAMA 2 13B                Alignment=Pretrained,...   2023.08  33
MPT 7B                     Alignment=Pretrained,...   2023.08  32.2
LLAMA 2 34B                Alignment=Pretrained,...   2023.08  31.8
StarCoder (Python) 15.5B   Alignment=Pretrained,...   2023.08  31
LLAMA 2 7B                 Alignment=Pretrained,...   2023.08  30.4
MPT-instruct 7B            Alignment=Instruct (al...  2023.08  30.2
Falcon 7B                  Alignment=Pretrained,...   2023.08  28.3
CODE LLAMA 34B             Alignment=Pretrained,...   2023.08  25.5
CODE LLAMA 7B              Alignment=Pretrained,...   2023.08  23
CODE LLAMA 13B             Alignment=Pretrained,...   2023.08  17.6