Instruction Generalization on RoleBench
[Chart: GPT-4 Win Rate and Human Win Rate over time. Top entry: RoleLLaMA-7B, 55.8 GPT-4 Win Rate, Oct 1, 2023]
Updated 4d ago
Evaluation Results
| Method | Details | Date | GPT-4 Win Rate | Human Win Rate |
| RoleLLaMA-7B | Backbone=LLaMA-7B | 2023.10 | 55.8 | 52 |
| Vicuna | | 2023.10 | 32 | 23.4 |
| Character.AI | | 2023.10 | 31.4 | 30.2 |
| Alpaca | | 2023.10 | 16 | 20 |
| ChatPLUG | | 2023.10 | 3.8 | 16.4 |