Role-playing on RoleBench Chinese (instruction generalization)
[Chart: Win Rate (vs GPT-4) by date. Top result: RoleGLM, 36.4 (Oct 1, 2023).]
Updated 4d ago
Evaluation Results
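| Method       | Date    | Win Rate (vs GPT-4) | Win Rate (vs Human) |
|--------------|---------|---------------------|---------------------|
| RoleGLM      | 2023.10 | 36.4                | 52.4                |
| ChatPLUG     | 2023.10 | 28.9                | 19.9                |
| Character.AI | 2023.10 | 28.2                | 19                  |
| ChatGLM2     | 2023.10 | 24.2                | 19.6                |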