General Evaluation on LiveBench
[Chart: Accuracy. Top result: phi-balancing, 46.83 (May 14, 2026).]
Evaluation Results
| Method | Details | Date | Accuracy |
| --- | --- | --- | --- |
| phi-balancing | Model=Moonlight-16B-A3... | 2026.05 | 46.83 |
| ST-MoE | Model=Moonlight-16B-A3... | 2026.05 | 42.21 |
| Frozen checkpoint | Model=Moonlight-16B-A3... | 2026.05 | 19.57 |
| phi-balancing | Model=DeepSeek-V2-Lite | 2026.05 | 19.42 |
| phi-balancing | Model=DeepSeek-V2-Lite | 2026.05 | 19.34 |
| ST-MoE | Model=DeepSeek-V2-Lite | 2026.05 | 18.8 |
| phi-balancing | Model=DeepSeek-MoE-Chat | 2026.05 | 17.85 |
| ST-MoE | Model=DeepSeek-V2-Lite | 2026.05 | 16.85 |
| ST-MoE | Model=DeepSeek-MoE-Chat | 2026.05 | 14.62 |
| phi-balancing | Model=DeepSeek-MoE-Chat | 2026.05 | 14.15 |
| Frozen checkpoint | Model=DeepSeek-V2-Lite | 2026.05 | 13.93 |
| Frozen checkpoint | Model=DeepSeek-V2-Lite | 2026.05 | 13.93 |
| ST-MoE | Model=DeepSeek-MoE-Chat | 2026.05 | 13.79 |
| Frozen checkpoint | Model=DeepSeek-MoE-Chat | 2026.05 | 5.92 |
| Frozen checkpoint | Model=DeepSeek-MoE-Chat | 2026.05 | 5.92 |