General Multi-domain Reasoning on AIME, MMLU-Pro, MedMCQA, and GPQA

Top result: SKILL-MOE, average score 62.43

Updated 1mo ago

Evaluation Results

Method	Date	Links	Average Score
SKILL-MOE	2025.03		62.43
Self-Consistency (SC)	2025.03		58.72
Zero-Shot CoT	2025.03		56.94
Zero-Shot CoT	2025.03		54.28
Self-MoA	2025.03		54.28
ReConcile	2025.03		53.8
Multi-Agent Debate	2025.03		53.76
MoA	2025.03		53.76
Zero-Shot CoT	2025.03		53.62
Zero-Shot CoT	2025.03		53.18
Self-Refine (SR)	2025.03		51.87
Zero-Shot CoT	2025.03		48.04