Creative Writing on Creative Writing Human Evaluation
[Chart: Human Preference Count and Percentage Score by method. Highlighted point: Min-k, Human Preference Count 75, Apr 13, 2026.]
Updated 4d ago
Evaluation Results
Method   Model Backbone   Date      Human Preference Count   Percentage Score
Min-k    Total            2026.04   75                       37.5
Top-nσ   Total            2026.04   67                       33.5
Tie      Total            2026.04   58                       29
Min-k    LLaMA3-...       2026.04   41                       -
Top-nσ   Qwen3-4...       2026.04   34                       -
Min-k    Qwen3-4...       2026.04   34                       -
Top-nσ   LLaMA3-...       2026.04   33                       -
Tie      Qwen3-4...       2026.04   32                       -
Tie      LLaMA3-...       2026.04   26                       -
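The page does not say how Percentage Score is derived. The reported figures are consistent with each method's Human Preference Count divided by the sum of the three "Total" counts (75 + 67 + 58 = 200), and each "Total" count matches the sum of its two per-backbone rows. A minimal sketch of that arithmetic, assuming this reading is correct; the function name `percentage_scores` and the dictionary layout are illustrative and not part of the benchmark:

```python
# Counts copied from the results above. Backbone names are truncated on the
# page and are kept exactly as shown.
total_counts = {"Min-k": 75, "Top-nσ": 67, "Tie": 58}
backbone_counts = {
    "Min-k": {"LLaMA3-...": 41, "Qwen3-4...": 34},
    "Top-nσ": {"LLaMA3-...": 33, "Qwen3-4...": 34},
    "Tie": {"LLaMA3-...": 26, "Qwen3-4...": 32},
}


def percentage_scores(counts):
    """Express each count as a percentage of all judgments.

    Assumption: the page's Percentage Score is count / sum of counts.
    """
    n = sum(counts.values())
    return {name: round(100 * c / n, 2) for name, c in counts.items()}


scores = percentage_scores(total_counts)

# Check that the per-backbone rows add up to the "Total" rows.
backbone_sums = {m: sum(per.values()) for m, per in backbone_counts.items()}

print(scores)
print(backbone_sums == total_counts)
```

Under this assumption the computed values match the Percentage Score entries for the three "Total" rows; the per-backbone rows show "-" because no percentage is reported for them.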