Aggregate Language and Logic Tasks on HumanEval++, MATH, MMLU-Redux

94.61 Average Accuracy

HieraMAS

Updated 3mo ago

Evaluation Results

Method	Date	Average Accuracy
HieraMAS	2026.02	94.61
Full-Graph	2026.02	92.78
AFlow	2026.02	92.69
Self-Consistency+CoT	2026.02	91.6
LLM-Debate	2026.02	91.45
MASRouter	2026.02	90.89
Random-Graph	2026.02	90.85
GDesigner	2026.02	90.68
CoT	2026.02	89.81
Self-Consistency	2026.02	89.77
Base	2026.02	83.14