Mathematical reasoning and calculation on TPS-CalcBench (test)

90.2 KPI

gpt-5.2

Apr 20, 2026
Updated 1mo ago

Evaluation Results

Method    Links
2026.04   90.2    2.438
2026.04   87.85   --
2026.04   84.1    4.944
2026.04   83.6    5.852
2026.04   79.4    7.661
2026.04   79.22   --
2026.04   77.8    --
2026.04   71.81   --
2026.04   57.3    15.271
2026.04   42.15   --
2026.04   28.7    15.668
2026.04   13.12   --