
Commonsense Reasoning on CommonSenseQA

91.2 Accuracy

Fine-tuning

[Chart: Accuracy over time, Jun 6, 2022 to Jan 12, 2026]
Updated 2d ago

Evaluation Results

Accuracy  Method  Links  Date
91.2      -       -      2026.01
83.6      -       -      2026.01
82.8      -       -      2022.06
82.5      -       -      2022.06
81.6      -       -      2026.01
81        -       -      2026.01
80.8      -       -      2022.06
79.9      -       -      2023.06
79.5      -       -      2023.06
79.3      -       -      2024.07
79.28     -       -      2022.06
79.2      -       -      2022.06
79        -       -      2023.06
78.9      -       -      2026.01
78.4      -       -      2023.06
78.2      -       -      2023.06
77.8      -       -      2026.01
77.6      -       -      2022.06
77.3      -       -      2023.06
77.3      -       -      2023.06
77.1      -       -      2023.06
76.5      -       -      2023.06
76        -       -      2023.05
75.92     -       -      2025.12
75.7      -       -      2023.05
75.59     -       -      2025.12
75.4      -       -      2023.06
75.4      -       -      2023.06
75.4      -       -      2025.12
75.3      -       -      2025.12
75.3      -       -      2026.01
75.2      -       -      2025.12
75.2      -       -      2025.12
75.2      -       -      2025.12
75.1      -       -      2025.12
75.1      -       -      2025.12
75.1      -       -      2023.05
75.02     -       -      2022.06
75        -       -      2025.12
75        -       -      2025.12
75        -       -      2025.12
74.9      -       -      2026.01
74.7      -       -      2025.12
74.7      -       -      2025.12
74.7      -       -      2025.12
74.5      -       -      2025.12
74.4      -       -      2023.06
74.4      -       -      2025.12
74.3      -       -      2025.12
74.3      -       -      2025.12
73.8      -       -      2025.12
73.8      -       -      2023.05
73.55     -       -      2023.06
73.5      -       -      2022.06
73.4      -       -      2026.01
73.4      -       -      2025.12
73.4      -       -      2022.06
73.3      -       -      2022.06
72.9      -       -      2025.12
72.9      -       -      2025.12
72.8      -       -      2026.01
72.7      -       -      2025.12
72.4      -       -      2026.01
72.1      -       -      2026.01
71.8      -       -      2025.12
71        -       -      2026.01
70.3      -       -      2025.12
70.3      -       -      2025.12
70.1      -       -      2026.01
69.62     -       -      2026.01
68.8      -       -      2023.06
68.8      -       -      2026.01
68.63     -       -      2022.06
67.8      -       -      2024.07
66.83     -       -      2026.01
66.5      -       -      2024.07
65.27     -       -      2024.07
65.11     -       -      2026.01
65        -       -      2024.07
64.78     -       -      2024.07
64.7      -       -      2023.06
64.6      -       -      2026.01
64.54     -       -      2026.01
64.53     -       -      2026.01
64.29     -       -      2026.01
63.06     -       -      2026.01
63.06     -       -      2026.01
62.24     -       -      2022.10
62.2      -       -      2026.01
62        -       -      2022.10
61.6      -       -      2026.01
61.43     -       -      2022.10
61.2      -       -      2026.01
60.85     -       -      2026.01
60.2      -       -      2026.01
59.3      -       -      2026.01
58.97     -       -      2026.01
58.25     -       -      2022.06
57.9      -       -      2026.01
57.66     -       -
Showing 100 of 134 rows