Personalized Story Generation Evaluation on PerMPST
[Figure: Score chart. Highlighted entry: EPER, Score 8.23 (Sep 16, 2025). Legend: Score, Score Delta, 95% CI Value, P(Delta > 0).]
Evaluation Results
| Method | Backbone | Date | Score | Score Delta | 95% CI Value | P(Delta > 0) |
|--------|------------|---------|-------|-------------|--------------|--------------|
| EPER | Mistral-7B | 2025.09 | 8.23 | - | - | - |
| SR | Mistral-7B | 2025.09 | 8.14 | 0.29 | 0.1 | 100 |
| EPIR | Mistral-7B | 2025.09 | 8.02 | 0.73 | 0.53 | 100 |
| IPER | Mistral-7B | 2025.09 | 8.01 | 0.69 | 0.48 | 100 |
| IPIR | Mistral-7B | 2025.09 | 7.88 | 0.94 | 0.74 | 100 |
| PP | Mistral-7B | 2025.09 | 7.83 | 0.58 | 0.34 | 100 |
| ZP | Mistral-7B | 2025.09 | 7.69 | 1.33 | 1.12 | 100 |