
Commonsense Reasoning on SocialIQA

88.1 Accuracy

HUMAN

[Chart: accuracy over time, Mar 24, 2021 to Feb 9, 2026]

Evaluation Results

Method  Links
2021.03  88.1   -     -
2024.07  83.5   -     -
2021.03  83.2   -     -
2024.07  80.8   -     -
2024.07  80.5   -     -
2026.01  80.19  -     -
2021.03  80     -     -
         79.8   -     -
2024.07  79.2   -     -
2026.01  79.02  -     -
2024.07  78.8   -     -
2024.05  77.2   -     -
         76.7   -     -
2024.05  76.2   -     -
2024.05  76     -     -
2024.05  75.8   -     -
2024.05  75.8   -     -
2024.07  75.4   -     -
2024.05  75.2   -     -
2024.07  75.1   -     -
2026.01  74.87  -     -
2024.05  74.8   -     -
2024.07  74.6   -     -
2024.05  74     -     -
2024.05  74     -     -
2024.07  74     -     -
2024.07  73.5   -     -
2024.07  73.4   -     -
2024.05  73.2   -     -
2024.05  73     -     -
2024.07  71.2   -     -
2024.07  69.6   -     -
2024.07  69.3   -     -
2023.06  67.3   -     -
2023.06  66.2   -     -
2023.06  66     -     -
2023.06  65.7   -     -
2023.06  65.5   -     -
2023.06  65.4   -     -
2023.06  65.3   -     -
2023.06  65.1   -     -
2023.06  65.1   -     -
2023.06  64.8   -     -
2023.06  64.3   -     -
2024.05  63.7   -     -
2024.05  63.5   -     -
2023.06  62.7   -     -
2024.05  60.8   -     -
2023.06  60.2   -     -
2024.05  56.8   -     -
2024.05  56.5   -     -
2024.05  55.9   -     -
2024.05  55.8   -     -
2024.05  55.7   -     -
2024.05  55.3   -     -
2026.02  55.1   -     -
2026.02  54.7   -     -
2026.02  54.6   -     -
2026.02  53.8   -     -
2024.05  53.5   -     -
2024.05  53.1   -     -
2026.02  52.9   -     -
2026.02  52.9   -     -
2025.12  50.6   -     -
2026.02  50.1   -     -
2024.05  49.1   -     -
2025.12  48.9   -     -
2024.05  48.5   -     -
2025.12  48.2   -     -
2025.12  45.2   -     -
2025.12  44.8   -     -
2025.12  44.8   -     -
2025.08  44.37  -     -
2024.05  44.3   -     -
2025.08  44.11  -     -
2025.08  43.91  -     -
2025.08  43.76  -     -
2025.08  42.84  -     -
2025.08  42.53  -     -
2025.08  42.12  -     -
2025.12  41.8   -     -
2025.12  41.7   -     -
2026.02  41.61  -     -
2025.08  41.56  -     -
2024.05  41.1   -     -
2024.05  40.3   -     -
2024.05  40.1   -     -
2025.08  39.82  -     -
2024.05  39.7   -     -
2024.05  39.3   -     -
2026.02  39.2   -     -
2024.05  39.1   -     -
2025.08  39     -     -
2025.08  39     -     -
2024.05  38.9   -     -
2024.05  38.9   -     -
2025.08  37.92  -     -
2025.07  -      61.2  61.2
2025.07  -      62.5  67.4
2025.07  -      52.8  64.9
Showing 100 of 112 rows