Role-Playing on Role-playing human assessment set (test)
[Chart: Win Count, Tie Count, Fail Count. Highlighted point: Baichuan2-7B+PCL, Win Count 602, Mar 22, 2025.]
Updated 4d ago
Evaluation Results
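| Method           | Details                   | Links   | Win Count | Tie Count | Fail Count |
|------------------|---------------------------|---------|-----------|-----------|------------|
| Baichuan2-7B+PCL | Backbone=Baichuan2-7B,... | 2025.03 | 602       | 92        | 306        |
| Qwen-7B+PCL      | Backbone=Qwen-7B, Asse... | 2025.03 | 582       | 181       | 237        |
| Qwen-7B+PCL      | Comparison Baseline=Qw... | 2025.03 | 303       | 137       | 60         |
| Baichuan2-7B+PCL | Comparison Baseline=Ba... | 2025.03 | 262       | 43        | 195        |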
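
The rows above do not share a total: the Win, Tie, and Fail counts sum to 1000 for the two "Backbone=..." entries and to 500 for the two "Comparison Baseline=..." entries, so raw win counts are not directly comparable across rows. A minimal sketch of one way to normalise them follows. The `win_rate` helper is hypothetical, and treating a tie as a non-win is an assumption; the page itself does not define a win rate. The config labels are truncated in the source and are kept as-is.

```python
# Win/Tie/Fail counts copied from the results listed above.
# Config labels are truncated in the source and left unchanged.
rows = [
    ("Baichuan2-7B+PCL", "Backbone=Baichuan2-7B,...", 602, 92, 306),
    ("Qwen-7B+PCL", "Backbone=Qwen-7B, Asse...", 582, 181, 237),
    ("Qwen-7B+PCL", "Comparison Baseline=Qw...", 303, 137, 60),
    ("Baichuan2-7B+PCL", "Comparison Baseline=Ba...", 262, 43, 195),
]


def win_rate(win: int, tie: int, fail: int) -> float:
    """Fraction of comparisons won (assumption: ties count as non-wins)."""
    total = win + tie + fail
    return win / total


for method, config, win, tie, fail in rows:
    total = win + tie + fail
    rate = win_rate(win, tie, fail)
    print(f"{method} ({config}): {total} comparisons, win rate {rate:.3f}")
```

Under this normalisation, the Qwen-7B+PCL "Comparison Baseline=Qw..." row has the highest share of wins (303 of 500) even though its raw win count is lower than either "Backbone=..." row.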