Knowledge-based reasoning on MMLU College Medicine 1.0 (test)

Accuracy: 86.13

QwQ-32B (Full COT)

[Figure: accuracy plot; y-axis ticks 51.1132, 60.2041, 69.295, 78.3859; x-axis date Aug 5, 2025]
Updated 1mo ago

Evaluation Results

Date      Accuracy   (unlabeled)
2025.08   86.13      2,912.3
2025.08   84.97      2,475.9
2025.08   84.97      2,326.4
2025.08   84.3       -
2025.08   62.34      2,127.9
2025.08   62.34      2,069.4
2025.08   61.73      2,612.7
2025.08   52.46      -