LLM Evaluation on AlpacaEval
[Figure: AlpacaE scores. Top result: GLM-4-Voice, 51.06 (Mar 10, 2026).]
Evaluation Results
Method | Configuration | Date | AlpacaE
GLM-4-Voice | Backbone=GLM-4-Voice,... | 2026.03 | 51.06
SPAR-K | Backbone=GLM-4-Voice,... | 2026.03 | 50.7
GLM-4-Voice | Backbone=GLM-4-Voice,... | 2026.03 | 50.2
SPAR-K | Backbone=GLM-4-Voice,... | 2026.03 | 49.75
SPAR-K | Backbone=GLM-4-Voice,... | 2026.03 | 49.45
SPAR-K | Backbone=GLM-4-Voice,... | 2026.03 | 48.64
SPAR-K | Backbone=GLM-4-Voice,... | 2026.03 | 48.19
SPAR-K | Backbone=GLM-4-Voice,... | 2026.03 | 47.84
SPAR-K | Backbone=Step-Audio-2,... | 2026.03 | 46.33
Step-Audio-2 | Backbone=Step-Audio-2,... | 2026.03 | 45.98
SPAR-K | Backbone=Step-Audio-2,... | 2026.03 | 45.98
SPAR-K | Backbone=Step-Audio-2,... | 2026.03 | 45.28
GLM-4-Voice | Backbone=GLM-4-Voice,... | 2026.03 | 44.17
Step-Audio-2 | Backbone=Step-Audio-2,... | 2026.03 | 35.58
Step-Audio-2 | Backbone=Step-Audio-2,... | 2026.03 | 24.92
SPAR-K | Backbone=GLM-4-Voice,... | 2026.03 | 10