General Knowledge and Reasoning on BBH, MMLU, CMMLU, and C-Eval Suite

59.48 BBH

Qwen3-Inst

Nov 28, 2025

Evaluation Results

Date      BBH     MMLU    CMMLU   C-Eval
2025.11   59.48   63.05   60.84   62.7
2025.11   57.25   61.34   64.06   62.86
2025.11   42.56   45.87   41.64   43.81
2025.11   26.25   42.24   43.33   45.41