Knowledge-based reasoning on MMLU High School History 1.0 (test)

92.83 Accuracy

QwQ-32B (Full COT)

Updated 5mo ago

Evaluation Results

Method	Date	Accuracy	(unlabeled)
QwQ-32B (Full COT)	2025.08	92.83	1,703.9
QwQ-32B (80%)	2025.08	92.83	1,683.9
QwQ-32B (90%)	2025.08	92.83	1,683.9
QwQ-32B (No Thinking)	2025.08	91.14	-
DeepSeek-R1-7B (80%)	2025.08	64.32	1,907.5
DeepSeek-R1-7B	2025.08	61.74	2,054.3
DeepSeek-R1-7B (90%)	2025.08	61.74	1,936.2
DeepSeek-R1-7B	2025.08	47.82	-