Commonsense Reasoning on CommonsenseQA (Accuracy)
[Chart: Accuracy over time, Oct 8, 2025 to May 24, 2026. Top result: LightReasoner, 64.2]
Updated 8d ago
Evaluation Results
| Method | Details | Links | Accuracy |
|---|---|---|---|
| LightReasoner | | 2025.10 | 64.2 |
| Baseline | | 2025.10 | 62.6 |
| NITP | Model scale=9bA1b, Eva... | 2026.05 | 49.96 |
| Full-data Fine-tuning | Backbone=LLAMA-3.2-1B,... | 2025.10 | 48.24 |
| NTP | Model scale=9bA1b, Eva... | 2026.05 | 45.7 |
| TRIM | Backbone=LLAMA-3.2-1B,... | 2025.10 | 40.76 |
| LESS | Backbone=LLAMA-3.2-1B,... | 2025.10 | 39.1 |
| BM25 | Backbone=LLAMA-3.2-1B,... | 2025.10 | 38.88 |
| NITP | Model scale=3bA0.5b, E... | 2026.05 | 37.92 |
| DSIR | Backbone=LLAMA-3.2-1B,... | 2025.10 | 37.16 |
| RDS | Backbone=LLAMA-3.2-1B,... | 2025.10 | 36.16 |
| TAGCOS | Backbone=LLAMA-3.2-1B,... | 2025.10 | 34.72 |
| NTP | Model scale=3bA0.5b, E... | 2026.05 | 34.15 |
| S2L | Backbone=LLAMA-3.2-1B,... | 2025.10 | 34.1 |
| Random | Backbone=LLAMA-3.2-1B,... | 2025.10 | 34.05 |
| CLD | Backbone=LLAMA-3.2-1B,... | 2025.10 | 33.1 |
| Pretrained (no Fine-tuning) | Backbone=LLAMA-3.2-1B,... | 2025.10 | 29.32 |
| NITP | Model scale=1.9bA0.3b,... | 2026.05 | 26.61 |
| NTP | Model scale=1.9bA0.3b,... | 2026.05 | 25.38 |