Scientific Reasoning on MMLU-Sci
[Chart: Score over time. Top entry: Galactica-30B, Score 54.96 (Jan 15, 2024)]
Updated 3d ago
Evaluation Results
| Method | Setting | Date | Score |
| --- | --- | --- | --- |
| Galactica-30B | Parameter Scale=30B~32B | 2024.01 | 54.96 |
| Llama3-8B-Instruct | Evaluation Protocol=fe... | 2024.01 | 52.67 |
| ChatGLM3-32B-Base | Parameter Scale=30B~32B | 2024.01 | 50.3 |
| SciGLM | Backbone=ChatGLM3-32B-... | 2024.01 | 49.38 |
| SciGLM | Backbone=ChatGLM3-6B-B... | 2024.01 | 45.34 |
| Mistral-7B: MetaMATH | Evaluation Protocol=fe... | 2024.01 | 44.74 |
| Mistral-7B: MetaMATH + SciInstruct | Fine-tuning=SciInstruc... | 2024.01 | 42.16 |
| ChatGLM3-6B | Parameter Scale=6B~7B | 2024.01 | 41.78 |
| Llama3-8B-Instruct + SciInstruct | Fine-tuning=SciInstruc... | 2024.01 | 40.86 |
| ChatGLM3-6B-Base | Parameter Scale=6B~7B | 2024.01 | 40.16 |
| ChatGLM2-6B-Base | Parameter Scale=6B~7B | 2024.01 | 38.06 |
| ChatGLM2-6B | Parameter Scale=6B~7B | 2024.01 | 37.09 |
| LLaMA-2-13B | Parameter Scale=12B~13B | 2024.01 | 35.85 |
| Vicuna-13B | Parameter Scale=12B~13B | 2024.01 | 32.13 |
| Galactica-6.7B | Parameter Scale=6B~7B | 2024.01 | 30.68 |
| LLaMA-2-7B | Parameter Scale=6B~7B | 2024.01 | 30.41 |
| Mistral-7B: MetaMATH | Evaluation Protocol=ze... | 2024.01 | 28.25 |
| Llama3-8B-Instruct | Evaluation Protocol=ze... | 2024.01 | 26.9 |