Science Reasoning on GPQA (Score)

Score: 81.4

DeepSeek v3.2

Updated 4mo ago

Evaluation Results

Method	Date	Score
DeepSeek v3.2	2025.12	81.4
GLM-4.6	2025.12	78.8
DeepSeek R1 0528	2025.12	77.5
GPT-OSS	2025.12	77.3
GLM-4.5	2025.12	77
INTELLECT-3	2025.12	74.4
GLM-4.5-Air	2025.12	73.3
TPO	2025.05	39
GRPO	2025.05	38
CPO	2025.05	35.5
KTO	2025.05	35
TI-DPO	2025.05	34.5
DPO	2025.05	34
TDPO	2025.05	34
SIMPO	2025.05	33.5
SFT	2025.05	33
IPO	2025.05	31