Multi-agent Reasoning on MMLU

Accuracy: 91.02

[Chart: Accuracy, series "Single Best"; date label: Oct 1, 2025]
Updated 14d ago

Evaluation Results

Date     Accuracy  Method  Links
2025.10  91.02
2025.10  90.37
2025.10  90.37
2025.10  90.01
2025.10  89.32
2025.10  88.64
2025.10  88.49
2025.10  87.92
2025.10  87.19