Personalized Response Generation on Assistant and Summary personalization tasks (test)
[Chart: Win Rate over time. Top entry: vol-mo, 83.91 (Nov 1, 2024).]
Evaluation Results
Method   Opponent   Date      Win Rate
vol-mo   vol-un     2024.11   83.91
vol-mo   rnd-un     2024.11   82.35
vol-mo   rnd-mo     2024.11   75.7
rnd-mo   vol-un     2024.11   60.62
rnd-mo   rnd-un     2024.11   58.99
rnd-un   vol-un     2024.11   52.44
vol-un   rnd-un     2024.11   47.56
rnd-un   rnd-mo     2024.11   41.01
vol-un   rnd-mo     2024.11   39.38
rnd-mo   vol-mo     2024.11   24.3
rnd-un   vol-mo     2024.11   17.65
vol-un   vol-mo     2024.11   16.09
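
The results above are pairwise: each row gives one method's win rate against a named opponent, and every pairing appears in both directions. The Python sketch below copies those values and checks that the two directions of each pairing sum to 100. It also computes an unweighted mean win rate per method. That mean is an illustrative aggregate assumed here, not a metric reported by the benchmark, and the names `WIN_RATE`, `complementary_gaps`, and `mean_win_rate` are hypothetical.

```python
# Pairwise win rates (%) copied from the results listed above.
# Key: (method, opponent) -> win rate of method against opponent.
WIN_RATE = {
    ("vol-mo", "vol-un"): 83.91,
    ("vol-mo", "rnd-un"): 82.35,
    ("vol-mo", "rnd-mo"): 75.7,
    ("rnd-mo", "vol-un"): 60.62,
    ("rnd-mo", "rnd-un"): 58.99,
    ("rnd-un", "vol-un"): 52.44,
    ("vol-un", "rnd-un"): 47.56,
    ("rnd-un", "rnd-mo"): 41.01,
    ("vol-un", "rnd-mo"): 39.38,
    ("rnd-mo", "vol-mo"): 24.3,
    ("rnd-un", "vol-mo"): 17.65,
    ("vol-un", "vol-mo"): 16.09,
}


def complementary_gaps(table):
    """Return |win(a, b) + win(b, a) - 100| for each unordered pair."""
    gaps = {}
    for (a, b), rate in table.items():
        if a < b:  # visit each unordered pair once
            gaps[(a, b)] = abs(rate + table[(b, a)] - 100.0)
    return gaps


def mean_win_rate(table):
    """Unweighted mean win rate per method (illustrative aggregate only)."""
    rates = {}
    for (method, _opponent), rate in table.items():
        rates.setdefault(method, []).append(rate)
    return {m: sum(r) / len(r) for m, r in rates.items()}


gaps = complementary_gaps(WIN_RATE)
means = mean_win_rate(WIN_RATE)
ranking = sorted(means, key=means.get, reverse=True)

print("largest complementarity gap:", max(gaps.values()))
for method in ranking:
    print(f"{method}: {means[method]:.2f}")
```

With the listed values, all six pairings are complementary up to floating-point rounding, so each matchup carries one independent number rather than two.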