Advanced Reasoning on T-Math

63.4 Accuracy

o4-mini (medium)

Updated 4mo ago

Evaluation Results

Method	Links	Accuracy
o4-mini (medium) 2025.12		63.4
DeepSeek-R1 2025.12		61.9
T-pro 2.0 2025.12		54.1
Qwen3-32B 2025.12		52.9
RuadaptQwen3-32B-Instruct 2025.12		44.4
DeepSeek-V3 2025.12		27.8
DeepSeek-R1-Distill-Qwen-32B 2025.12		25.4
Gemma 3 27B 2025.12		20.8
GigaChat 2 Max 2025.12		14.2
YandexGPT5-Pro 2025.12		13
GPT-4o 2025.12		10.6