Mathematical Reasoning on Math Evaluation Suite
[Chart: Math Score over time. Best result: 28.1 (Sparse upcycling), Mar 12, 2024.]
Evaluation Results
Method            Configuration                 Links     Math Score
Sparse upcycling  Training Context=Data-...     2024.03   28.1
LLEMMA            Model Size=7B                 2024.03   28
BTX               Active Experts (Top-k)=2      2024.03   27.4
BTX               Active Experts (Top-k)...     2024.03   26.4
BTM               Active Experts (Top-k)=2      2024.03   21.5
BTM               Active Experts (Top-k)=1      2024.03   21.3
Dense             Training Context=Data-...     2024.03   18.3
LLAMA-2           Model Size=13B                2024.03   16.3
LLAMA-2           Model Size=7B                 2024.03   8.6
CODELLAMA         Model Size=7B                 2024.03   8.1