Olmo

Benchmarks

Task Name	Dataset Name	SOTA Result
Membership Inference Attack	OLMo-7B	TPR@1%FPR 55.3	16
Membership Inference Attack	OLMo near-IID Dolma 3 (test)	AUC 0.723	13
Model Editing	OLMo-7B memorized sequences	Success Rate (%) 87.9	11
Downstream Policy Evaluation	OLMo3 Adapt	GSM8K 96	10
Membership Inference Attack	OLMO initial checkpoint	AUC 0.54	8
Verified Dependency Edge Recovery	Olmo public artifacts 3	Verified Dependency Edges 481	7
Training Data Attribution	Olmo-7B	Tail-patch (%) 98.6	5
Pretraining Data Mixture Estimation	OLMo Pretraining Mixture (Temporal held-out)	Web Source Estimate 95.5	3
Language Model Training Performance	OLMo-1B 24k sequence length	Training Time (ms) 6,161	2
General Language Evaluation	OLMo-2 Held-out Evals	AGIEval Score 24.4	2
Question Answering	OLMo Benchmarks 2 (dev)	NQ Score 16.1	2
Language Modeling	OLMo (val)	Base CE 2.24	1
