Roleplay Personalization on Roleplay Human Evaluation (50 users, 11 questions)
[Chart: Winrate over time. Top result: FSPO, 72.3 (Feb 26, 2025)]
Evaluation Results
Method   Baseline method   Date      Winrate (%)
FSPO     SFT               2025.02   72.3
FSPO     Base              2025.02   68.2