General Language Evaluation on OLMo-2 Held-out Evals
[Chart: AGIEval Score by date (Sep 27, 2025). Top result: OLMo-2-0425-1B, 24.4. Other selectable metrics: GSM8K Score, MMLU Pro Score, TQA Score.]
Updated 1mo ago
Evaluation Results
| Method | Details | Date | AGIEval Score | GSM8K Score | MMLU Pro Score | TQA Score |
|---|---|---|---|---|---|---|
| OLMo-2-0425-1B | Training Tokens=4T, Ev... | 2025.09 | 24.4 | 3.4 | 11.1 | 50 |
| OLMo-2-1B | Training Tokens=210B,... | 2025.09 | 22.2 | 2.4 | 10.9 | 41.4 |