
General Reasoning on Overall

Accuracy: 93.51

ROMCSC

[Chart: accuracy over time, Mar 23, 2026 to May 25, 2026]
Updated 8d ago

Evaluation Results

Method    Accuracy                                    Links
2026.03   93.51       947     1,159   15.57   12.37
2026.03   93.47     1,535     1,843    7.57    6.26
2026.03   93.07     1,049     1,313   10.08    8.1
2026.03   92.77     1,282     1,842    7.86    5.79
2026.03   92.53     1,188     1,421   11.16    9.14
2026.03   91.72     1,636     2,197    7.04    5.6
2026.03   91.29       912     1,743   10.15    5.9
2026.03   84.95       513     1,680   16.58    5.83
2026.05   55.87         -         -       -       -
2026.05   54.22         -         -       -       -
2026.05   52.6          -         -       -       -
2026.05   52.09         -         -       -       -
2026.05   51.77         -         -       -       -
2026.05   51.69         -         -       -       -
2026.05   51.68         -         -       -       -
2026.05   50.51         -         -       -       -
2026.05   50.46         -         -       -       -
2026.05   49.37         -         -       -       -
2026.05   49.24         -         -       -       -
2026.05   49.18         -         -       -       -
2026.05   47.95         -         -       -       -
2026.05   47.22         -         -       -       -
2026.05   41.34         -         -       -       -
2026.05   37.96         -         -       -       -