
Knowledge-based reasoning on MMLU College Medicine 1.0 (test)

86.13 Accuracy

QwQ-32B (Full COT)

Updated 5mo ago

Evaluation Results

Method	Links	Accuracy
QwQ-32B (Full COT) 2025.08		86.13	2,912.3
QwQ-32B (80%) 2025.08		84.97	2,475.9
QwQ-32B (90%) 2025.08		84.97	2,326.4
QwQ-32B (No Thinking) 2025.08		84.3	-
DeepSeek-R1-7B (80%) 2025.08		62.34	2,127.9
DeepSeek-R1-7B (90%) 2025.08		62.34	2,069.4
DeepSeek-R1-7B 2025.08		61.73	2,612.7
DeepSeek-R1-7B 2025.08		52.46	-