Factual Knowledge on MMLU-Pro (test)

24.5 Accuracy

Alpaca-GPT4 + NAIT (GSM)

[Chart: Accuracy; Mar 13, 2026]
Updated 1mo ago

Evaluation Results

Method  Date     Accuracy  (unlabeled)  (unlabeled)  Links
        2026.03  24.5      37.7         4.65
        2026.03  23.61     37.18        3.2
        2026.03  23.46     37.06        2.88
        2026.03  23.36     36.47        1.23
        2026.03  23.29     37.2         3.24
        2026.03  23.04     36.17        0.39
        2026.03  22.86     37.16        3.15
        2026.03  22.82     36.71        1.89
        2026.03  21.96     35.18        -2.34
        2026.03  21.89     36.03        -
        2026.03  21.5      35.68        -0.98
        2026.03  21.43     35.69        -0.94