| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Membership Inference Attack | OLMo near-IID Dolma 3 (test) | AUC 0.723 | 13 |
| Downstream Policy Evaluation | OLMo3 Adapt | GSM8K 96 | 10 |
| Membership Inference Attack | OLMO initial checkpoint | AUC 0.54 | 8 |
| Training Data Attribution | Olmo-7B | Tail-patch (%) 98.6 | 5 |
| Pretraining Data Mixture Estimation | OLMo Pretraining Mixture (Temporal held-out) | Web Source Estimate 95.5 | 3 |
| Language Model Training Performance | OLMo-1B 24k sequence length | Training Time (ms) 6,161 | 2 |
| General Language Evaluation | OLMo-2 Held-out Evals | AGIEval Score 24.4 | 2 |
| Question Answering | OLMo Benchmarks 2 (dev) | NQ Score 16.1 | 2 |
| Language Modeling | OLMo (val) | Base CE 2.24 | 1 |