| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Text Generation | NoveltyBench | Diversity 10 | 81 |
| Patience-discounted reward evaluation | NoveltyBench | Utility 4.096 | 36 |
| Output Diversity | NOVELTYBENCH | Distinct Score 52.42 | 31 |
| Instruction Following | NoveltyBench | Lexical Dominance 40.1 | 7 |
| Novelty Evaluation | NoveltyBench | Overall Dominance 44 | 5 |
| Diversity Measurement | NoveltyBench curated | D_Can Mean 48.1 | 4 |
| Human Evaluation | NoveltyBench | Quality 4.04 | 2 |