Mathematical Reasoning on TabMWP (accuracy)

97.61 Accuracy

Zero-shot-EI

[Chart: accuracy over time, Feb 15, 2024 to Feb 26, 2026]
Updated 17d ago

Evaluation Results

Date     Accuracy
2026.02  97.61
2026.02  97.43
2025.08  97.2
2026.02  97.1
2025.08  96.9
2026.02  96.73
2026.02  96.15
2026.02  96.12
2024.05  95.9
2024.02  95.9
2026.02  95.67
2026.02  95.62
2026.02  95.12
2026.02  95.1
2026.02  94.76
2026.02  94.17
2025.06  93.9
2025.06  93.9
2026.02  93.88
2025.08  93.2
2025.06  92.9
2026.02  91.23
2024.05  90.8
2026.02  90.55
2025.08  89.9
2025.06  89.2
2026.02  86.21
2026.02  84.71
2024.05  84.7
2025.08  84
2024.05  82
2025.09  81.8
2024.05  80.5
2024.05  79.9
2024.05  79.9
2024.05  79.2
2025.08  78.4
2025.09  77
2024.05  75.6
2026.02  75.18
2024.05  75.1
2024.05  74.8
2026.01  74.3
2024.02  74.2
2026.01  74.2
2024.02  74
2026.01  73.9
2026.01  73.9
2026.01  73.6
2026.01  72.8
2026.01  71.5
2024.02  70.8
2024.05  70.5
2024.02  70.5
2026.01  70.4
2026.01  70.4
2026.01  70.4
2026.01  70.4
2026.01  70.3
2026.01  70.3
2026.01  70.3
2026.01  70.3
2026.01  70.2
2024.05  70.1
2024.05  70
2024.02  70
2026.01  70
2026.01  69.9
2026.01  69.9
2026.01  69.9
2026.01  69.9
2024.05  69.8
2026.01  69.7
2026.01  69.6
2026.01  69.6
2026.01  69.5
2025.09  69.5
2026.01  69.1
2026.01  68.6
2026.01  68.6
2026.01  68.5
2026.01  68.3
2024.05  67.5
2024.05  67.3
2026.01  67.2
2026.01  67.2
2026.01  67
2026.01  66.7
2026.01  66.6
2026.01  66.4
2026.01  66.4
2026.01  66.3
2026.01  66.3
2026.01  66.3
2026.01  66.3
2026.01  66.1
2024.02  66
2026.02  65.93
2026.01  65.7
2026.01  65.7
Showing 100 of 188 rows