| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Language Modeling | TinyStories (val) | Last Loss: 1.1284 | 21 |
| Narrative Generation | TinyStories 21 (test) | Speedup (x): 1.62 | 15 |
| Text Continuation | TinyStories 1K random samples (test) | R-1: 39.7 | 10 |
| Language Modeling | TinyStories 60M tokens (val) | PPL (Val): 51.23 | 8 |
| Language Modeling | TinyStories 10k (val) | Validation Loss (nats/token): 1.1284 | 7 |
| Narrative Video Generation | TinyStories | Image Quality: 76.93 | 7 |
| Topical Text Steering | TinyStories | Average Target Score: 31.5 | 6 |
| Scaling-law extrapolation | TinyStories high-D holdout | RMSE (log space): 0.053 | 6 |
| Scaling-law extrapolation | TinyStories high-C holdout | RMSE (log space): 0.095 | 6 |
| Language Modeling Evaluation | TinyStories | Grammar: 6.63 | 5 |
| Story Generation Evaluation | TinyStories GPT-4.1 Nano | Grammar: 6.47 | 5 |
| Story Generation | TinyStories | Grammar Score: 6.37 | 5 |
| Language Generation | TinyStories (test) | Grammar: 9.93 | 5 |
| Token recovery | TinyStories | Mean Queries: 2 | 2 |
| Lineage Verification | TinyStories | p-value: 0 | 2 |
| Fingerprint persistence | TinyStories cleaned V2 | T-Test Statistic: 0 | 2 |
| Model Fingerprint Verification | TinyStories (test) | t-test p-value: 0 | 2 |
| Lineage Verification | TinyStories Continual seed 123 (train) | t-test (logits): 0.434 | 1 |
| Lineage Verification | TinyStories seed 1000 Continual (train) | t-test p-value (logits): 5.36 | 1 |