Open-form mathematical reasoning on DeepMind MATHEMATICS (Exact-match accuracy)
Chart: Exact-match Accuracy over time. Best result: TRIM, 42.13 (Oct 8, 2025).
Updated 19d ago
Evaluation Results
Method                       Details                    Date     Exact-match Accuracy
TRIM                         Backbone=LLAMA-2-7B, C...  2025.10  42.13
Full-data Fine-tuning        Backbone=LLAMA-2-7B, C...  2025.10  42.1
S2L                          Backbone=LLAMA-2-7B, C...  2025.10  41.92
LESS                         Backbone=LLAMA-2-7B, C...  2025.10  41.8
TAGCOS                       Backbone=LLAMA-2-7B, C...  2025.10  40.84
Random                       Backbone=LLAMA-2-7B, C...  2025.10  38.75
Pretrained (no Fine-tuning)  Backbone=LLAMA-2-7B, C...  2025.10  8.3