
General Reasoning on Overall (Efficiency Metrics)

84.8 Accuracy

DTSR

[Chart: x-axis Apr 6, 2026 to Apr 8, 2026; y-axis ticks 49.648, 58.774, 67.9, 77.026]
Updated 9d ago

Evaluation Results

Method    Links

Date      Accuracy
2026.04   84.8      3,748   -       -
2026.04   84.7      5,638   -       -
2026.04   84.6      4,010   -       -
2026.04   84.4      5,761   -       -
2026.04   83.3      7,532   100     7,532
2026.04   83.3      4,285   -       -
2026.04   83.2      4,584   -       -
2026.04   83.1      4,318   -       -
2026.04   83        4,367   -       -
2026.04   81.9      6,510   -       -
2026.04   81.5      7,099   91.3    7,394
2026.04   81        4,428   -       -
2026.04   79.6      4,605   55.3    4,684
2026.04   79.3      4,532   -       -
2026.04   79.1      4,832   -       -
2026.04   78.5      6,086   82.7    6,293
2026.04   78.4      8,344   100     8,344
2026.04   78        4,256   50.1    4,330
2026.04   77.8      8,221   96.9    8,565
2026.04   77.4      6,176   70      6,294
2026.04   77.1      5,859   65.4    5,956
2026.04   73.6      6,547   78.5    6,777
2026.04   73.6      1,302   -       -
2026.04   73.2      5,729   100     5,729
2026.04   72        4,736   80.8    4,830
2026.04   71.8      5,536   96.2    5,776
2026.04   71.7      5,028   89.5    9,223
2026.04   71.6      1,123   -       -
2026.04   71.4      4,516   75.4    4,607
2026.04   70.5      5,202   91.1    5,417
2026.04   70.5      1,458   -       -
2026.04   69.8      6,595   75.3    11,668
2026.04   66        7,514   100     7,514
2026.04   66        7,518   100     7,802
2026.04   64.2      5,855   73.2    6,292
2026.04   63.8      5,545   68.3    5,935
2026.04   62.5      6,436   86.3    6,824
2026.04   53.7      1,736   29.9    2,399
2026.04   51.1      2,051   20.3    3,191
2026.04   51        2,734   33.1    4,170