Response Generation on Open Assistant 953 prompts (test)
[Chart: Elo Rating. Highlighted point: GPT-4, 1,294, May 23, 2023. Available metrics: Elo Rating, Rank, Median Rank.]
Updated 4d ago
Evaluation Results
Method             Details                    Date     Elo Rating  Rank  Median Rank
-----------------  -------------------------  -------  ----------  ----  -----------
GPT-4              Judge=GPT-4, # Prompts...  2023.05  1,294       1     1
ChatGPT-3.5 Turbo  Judge=GPT-4, # Prompts...  2023.05  1,015       2     5
Guanaco-65B        Judge=GPT-4, # Prompts...  2023.05  1,008       3     2
Guanaco-33B        Judge=GPT-4, # Prompts...  2023.05  1,002       4     4
Vicuna-13B         Judge=GPT-4, # Prompts...  2023.05  936         5     5
Guanaco-13B        Judge=GPT-4, # Prompts...  2023.05  885         6     6
Guanaco-7B         Judge=GPT-4, # Prompts...  2023.05  860         7     7
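
To help read the gaps between the Elo ratings listed above, here is a minimal sketch that converts a rating difference into an expected head-to-head score. It assumes the conventional Elo logistic formula with base 10 and a scale of 400; the page does not state which parameters this benchmark used, so the function name `elo_expected_score`, the `scale` default, and the resulting numbers are illustrative assumptions rather than reported results.

```python
def elo_expected_score(rating_a: float, rating_b: float, scale: float = 400.0) -> float:
    """Expected score of A against B under the conventional Elo model.

    Assumption: base-10 logistic curve with a 400-point scale. The
    leaderboard does not confirm that it uses these parameters.
    """
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / scale))


# Elo ratings copied from the leaderboard table.
ratings = {
    "GPT-4": 1294,
    "ChatGPT-3.5 Turbo": 1015,
    "Guanaco-65B": 1008,
    "Guanaco-33B": 1002,
    "Vicuna-13B": 936,
    "Guanaco-13B": 885,
    "Guanaco-7B": 860,
}

if __name__ == "__main__":
    # Compare every method against the top-rated entry.
    top = "GPT-4"
    for name, rating in ratings.items():
        if name == top:
            continue
        p = elo_expected_score(ratings[top], rating)
        print(f"{top} vs {name}: expected score {p:.2f}")
```

Under these assumptions, equal ratings give an expected score of 0.5, and a gap of a few points (such as Guanaco-65B versus Guanaco-33B) stays very close to it, which is why Rank and Median Rank can disagree for closely rated methods.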