Commonsense Reasoning on MC-TACO
[Chart: Exact Match (EM) over time, Oct 5, 2022 to Mar 17, 2026. Best result: MDM-Prime-v2, 66.14. Updated 1mo ago.]
Evaluation Results
| Method | Settings | Date | Exact Match (EM) |
|---|---|---|---|
| MDM-Prime-v2 | Zero-shot=true, Number... | 2026.03 | 66.14 |
| Pythia | Zero-shot=true, Number... | 2026.03 | 54.25 |
| Bloom | Zero-shot=true, Number... | 2026.03 | 53.63 |
| GPT-Neo | Zero-shot=true, Number... | 2026.03 | 42.89 |
| TinyLLaMA | Zero-shot=true, Number... | 2026.03 | 40.88 |
| OPT | Zero-shot=true, Number... | 2026.03 | 37.08 |
| SMDM | Zero-shot=true, Number... | 2026.03 | 35.07 |
| GLM-130B | shots=0, parameters=130B | 2022.10 | 13.6 |
| BLOOM 176B | shots=0, parameters=176B | 2022.10 | 13.1 |
| OPT 175B | shots=0, parameters=175B | 2022.10 | 12.4 |