"I've Seen How This Goes": Characterizing Diversity via Progressive Conditional Surprise
About
Measuring the diversity of creative outputs is central to evaluating post-training mode collapse, comparing decoding strategies, and quantifying creative behavior in both AI and human writing. We propose a new approach to measuring diversity that uses in-context learning. Its working instance, and the one we evaluate, is the ``Decan'' metric, $D_{Ca_n} = C \times a_n$: a per-byte score read off the per-token log-probabilities of a base model $\theta$ in a \emph{single forward pass} per permutation, with no embedding model, no reference corpus, and no human labels. The approach is grounded in information theory, uses language-model in-context learning to detect a wide range of similarities between any number of inputs, and requires no special-purpose model to be trained. The same pipeline scores AI samples and human-written response sets, with diversity treated as a property of the triple (responses, prompt, scoring model). On Tevet and Berant's human-grounded McDiv benchmark, $D_{Ca_n}$ reaches an optimal classification accuracy (OCA) of 0.846 on the McDiv prompt\_gen set, where it performs best; this is below the strongest neural baseline reported by Tevet and Berant (SentBERT, 0.897). On the OLMo-2-7B post-training pipeline, $D_{Ca_n}$ drops monotonically across the base $\to$ SFT $\to$ DPO $\to$ RLVR stages, detecting the type of diversity loss that creative-writing applications care about.
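The per-byte scoring described above can be illustrated with a minimal sketch. It assumes the per-token log-probabilities (natural log) have already been extracted from one forward pass per permutation and grouped by response position. The function names `bits_per_byte` and `surprise_curve` are hypothetical, and the sketch stops at the averaged per-position surprise curve: the definitions of $C$ and $a_n$ are not given in this section, so they are not reproduced here.

```python
import math
from typing import List, Sequence, Tuple

# One scored response: (per-token natural-log probabilities, length in bytes).
# Both are assumed to come from the scoring model's forward pass; this sketch
# does not run a model.
Scored = Tuple[Sequence[float], int]


def bits_per_byte(token_logprobs: Sequence[float], n_bytes: int) -> float:
    """Surprisal of one response in bits per byte."""
    if n_bytes <= 0:
        raise ValueError("n_bytes must be positive")
    # Negative log-likelihood in nats, converted to bits, normalized by bytes.
    return -sum(token_logprobs) / (math.log(2) * n_bytes)


def surprise_curve(permutations: List[List[Scored]]) -> List[float]:
    """Average bits-per-byte at each context position across permutations.

    permutations[p][k] holds the k-th response in permutation p, scored while
    conditioned on the prompt and the k responses that precede it.
    """
    if not permutations:
        raise ValueError("need at least one permutation")
    n_positions = len(permutations[0])
    if any(len(p) != n_positions for p in permutations):
        raise ValueError("all permutations must have the same length")
    curve = []
    for k in range(n_positions):
        values = [bits_per_byte(lp, nb) for lp, nb in (p[k] for p in permutations)]
        curve.append(sum(values) / len(values))
    return curve


# Toy numbers only: two permutations of two responses each.
perm_a = [([-math.log(2)] * 8, 4), ([-math.log(2)] * 4, 4)]
perm_b = [([-math.log(2)] * 4, 2), ([-math.log(2)] * 2, 4)]
curve = surprise_curve([perm_a, perm_b])
```

In this toy input the later position costs fewer bits per byte than the first, which is the pattern one would expect when responses resemble each other and the model has "seen how this goes"; a flatter curve would indicate a more diverse set.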
Related benchmarks
| Task | Dataset | Spearman ρ | Rank |
|---|---|---|---|
| prompt_gen | ConTest 200 with_hds | 0.686 | 12 |
| Prompt Generation | DecTest prompt_gen 1000 samples no_hds | 0.932 | 7 |
| Response Generation | DecTest resp_gen no_hds (1000 samples) | 0.924 | 7 |
| Story Generation | DecTest story_gen no_hds (1000 samples) | 0.779 | 7 |
| prompt_gen | McDiv nuggets ~1K, no_hds | 0.636 | 6 |
| prompt_gen | McDiv full no_hds ~2K | 0.729 | 6 |
| resp_gen | ConTest 200 with_hds | 0.391 | 6 |
| resp_gen | McDiv nuggets ~1K no hds | 0.345 | 6 |
| story_gen | McDiv_nuggets ~1K no_hds | 0.317 | 6 |
| resp_gen | McDiv full no_hds ~2K | 0.5 | 6 |
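
The results above are all Spearman rank correlations. As a reminder of what that statistic measures, here is a minimal pure-Python sketch that correlates a metric's scores with a benchmark's reference diversity values. The function names `average_ranks` and `spearman_rho` and the toy inputs are illustrative assumptions, not part of the paper's evaluation code.

```python
import math
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks, with tied values sharing the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0 or vy == 0:
        raise ValueError("rho is undefined for constant input")
    return cov / math.sqrt(vx * vy)


# Toy numbers only: metric scores vs. reference diversity values.
metric_scores = [0.2, 0.5, 0.4, 0.9]
reference = [1.0, 3.0, 2.0, 4.0]
rho = spearman_rho(metric_scores, reference)
```

Because only ranks enter the computation, a metric is rewarded for ordering response sets the same way as the reference, regardless of the scale of its raw scores.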