
Mathematical Reasoning on AQUA-RAT

91.73 Accuracy

Q-Opt + P-Opt

[Chart: accuracy over time, May 24, 2022 to May 15, 2026]
Updated 16d ago

Evaluation Results

Date     Accuracy  (unlabeled)
2026.03  91.73     -
2026.03  90.34     -
2026.03  89.67     -
2026.03  89.15     -
2026.03  88.23     -
2026.03  87.86     -
2026.05  87.7      -
2026.05  87.6      -
2026.05  87.6      -
2026.03  87.4      -
2026.05  87.3      -
2026.03  86.78     -
2026.03  86.61     -
2026.05  86.6      -
2026.03  85.92     -
2026.05  85.5      -
2026.05  85.5      -
2026.05  85.5      -
2026.05  84.6      -
2026.05  84.5      -
2026.05  84.3      -
2026.05  84        -
2026.05  83.7      -
2026.05  83.5      -
2026.05  83        -
2026.05  83        -
2026.05  82.9      -
2026.05  82.6      -
2026.05  82.6      -
2024.03  70.9      -
2026.05  69.9      -
2023.06  69.7      -
2024.03  66.9      -
2023.06  66.8      -
2024.03  66.1      -
2023.06  65        -
2024.03  60.6      -
2023.06  60.2      -
2024.03  59.4      -
2026.01  59.06     -
2026.01  59.06     -
2024.03  58.7      -
2024.03  58.6      -
2026.01  58.4      -
2026.05  57.74     -
2026.05  57.58     -
2024.03  57.5      -
2026.05  57.13     -
2026.05  56.82     -
2023.06  56.5      -
2026.05  56.46     -
2026.05  55.91     -
2024.03  55.9      -
2026.04  55.51     1.38
2024.03  55.5      -
2026.01  55.12     -
2023.06  55.1      -
2024.03  54.724    -
2024.03  54.724    -
2026.01  54.72     -
2026.05  54.46     -
2024.03  54.331    -
2024.03  54.1      -
2026.04  53.54     0.59
2026.04  53.15     2.15
2024.03  52.756    -
2026.05  52.62     -
2024.03  52.362    -
2024.03  52        -
2024.03  52        -
2026.01  51.97     -
2026.01  51.97     -
2026.04  49.21     5.49
2024.03  48.4      -
2022.05  48.3      -
2026.04  47.24     2.36
2026.04  46.85     1.38
2026.04  46.85     2.35
2022.05  46.5      -
2026.04  46.06     2.96
2026.01  44.88     -
2026.04  44.88     5.49
2026.04  44.09     1.18
2024.03  43.9      -
2026.01  42.13     -
2026.04  42.13     2.16
2026.01  41.34     -
2026.04  41.34     2.55
2026.01  40.94     -
2026.01  40.94     -
2024.03  40.2      -
2026.01  40.16     -
2026.01  38.58     -
2026.01  37.93     -
         37.9      -
2023.06  37.8      -
2026.01  37.4      -
2024.03  37.4      -
2023.06  36.5      -
2022.05  36.1      -
Showing 100 of 153 rows