Open-ended Generation on NoveltyBench, WildChat, and Narrative-Discourse Average (test)
[Chart: Lexical Dominance by method; highlighted entry: Aligned, 49 (Nov 7, 2025). Selectable metrics: Lexical Dominance, Lexical Coverage, Semantic Coverage, Semantic Dominance, Overall Coverage, Overall Dominance.]
Evaluation Results
| Method | Date | Lexical Dominance | Lexical Coverage | Semantic Coverage | Semantic Dominance | Overall Coverage | Overall Dominance |
|---|---|---|---|---|---|---|---|
| Aligned | 2025.11 | 49 | 26.9 | 10.4 | 29.2 | 18.6 | 39 |
| BACO (Selection Strategy=Best) | 2025.11 | 24.9 | 44.5 | 36 | 40.5 | 40.3 | 32.7 |
| Base | 2025.11 | 12.7 | 9.8 | 9.8 | 16 | 9.8 | 14.3 |
| Nudging | 2025.11 | 9.3 | 27.6 | 24.7 | 9.9 | 26.1 | 9.6 |
| Prompting (Selection Strategy=Best) | 2025.11 | 2.7 | - | - | 2.2 | - | 2.4 |
| Ensemble (Selection Strategy=Best) | 2025.11 | 1.1 | - | - | 1.9 | - | 1.5 |
| Decoding | 2025.11 | 0.3 | - | - | 0.3 | - | 0.3 |