Mathematical and Quantitative Reasoning on Mathematic and Quantitative (test)
[Figure: Accuracy chart. Top result: MSFT, 51 (Mar 23, 2026).]
Updated 2mo ago
Evaluation Results
Method         Configuration              Date     Accuracy  Epochs
MSFT           Size=Average, Evaluati...  2026.03  51        4.12
SFT            Size=Average, Evaluati...  2026.03  48        3.88
IES            Size=Average, Evaluati...  2026.03  48        3.88
DynamixSFT     Size=Average, Evaluati...  2026.03  47.9      4.04
Continual SFT  Size=Average, Evaluati...  2026.03  47.8      1.71
Base           Size=Average, Evaluati...  2026.03  47.6      -