Instruction Following Evaluation on BPO Eval (test)
[Chart: A Win Rate and B Win Rate over time. Top result: BPO, 58.5 A Win Rate, Nov 7, 2023.]
Evaluation Results
Method  Details                    Date     A Win Rate  B Win Rate
BPO     Base LLM=gpt-3.5-turbo...  2023.11  58.5        41.5
BPO     Base LLM=text-bison, M...  2023.11  53          47
BPO     Base LLM=claude-instan...  2023.11  52.5        47.5
BPO     Base LLM=claude-2, Met...  2023.11  52          48
BPO     Base LLM=gpt-4, Method...  2023.11  51.5        48.5
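
As a quick sanity check on the results listed above, the following Python sketch re-enters the five rows, confirms that each pair of win rates sums to 100, and computes the margin between them. The row labels are copied verbatim, truncation included. The names `ROWS` and `margins` are illustrative and not part of the benchmark page, and the page itself does not say which systems A and B denote.

```python
# Rows copied from the results listing: (label, A win rate, B win rate).
# Labels are kept exactly as shown on the page, including the "..." truncation.
ROWS = [
    ("Base LLM=gpt-3.5-turbo...", 58.5, 41.5),
    ("Base LLM=text-bison, M...", 53.0, 47.0),
    ("Base LLM=claude-instan...", 52.5, 47.5),
    ("Base LLM=claude-2, Met...", 52.0, 48.0),
    ("Base LLM=gpt-4, Method...", 51.5, 48.5),
]


def margins(rows):
    """Return (label, A minus B) for each row, in the original order."""
    return [(label, a - b) for label, a, b in rows]


if __name__ == "__main__":
    for label, a, b in ROWS:
        # Each pair should account for all comparisons (no ties reported).
        assert abs(a + b - 100.0) < 1e-9, label
    for label, margin in margins(ROWS):
        print(f"{label}: A leads B by {margin:.1f} points")
```

The margin is simply the A Win Rate minus the B Win Rate, so it shrinks from the first row to the last in the same order the page ranks them.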