General Evaluation on LiveBench

46.83 Accuracy

phi-balancing

Updated 2mo ago

Evaluation Results

Method	Links	Accuracy
phi-balancing 2026.05		46.83
ST-MoE 2026.05		42.21
Frozen checkpoint 2026.05		19.57
phi-balancing 2026.05		19.42
phi-balancing 2026.05		19.34
ST-MoE 2026.05		18.8
phi-balancing 2026.05		17.85
ST-MoE 2026.05		16.85
ST-MoE 2026.05		14.62
phi-balancing 2026.05		14.15
Frozen checkpoint 2026.05		13.93
Frozen checkpoint 2026.05		13.93
ST-MoE 2026.05		13.79
Frozen checkpoint 2026.05		5.92
Frozen checkpoint 2026.05		5.92