Text Summarization on Summary (LLM-as-judge)
[Chart: LLM-as-judge Score by date (x-axis tick: Jun 1, 2026). Top entry: ProbMoE, 44.4.]
Updated 18h ago
Evaluation Results
| Method        | Configuration             | Date    | LLM-as-judge Score |
|---------------|---------------------------|---------|--------------------|
| ProbMoE       | Backbone=Qwen1.5-MoE-A... | 2026.06 | 44.4               |
| DenseMixer    | Backbone=Qwen1.5-MoE-A... | 2026.06 | 41                 |
| ProbMoE       | Backbone=OLMoE-1B-7B,...  | 2026.06 | 39.29              |
| Conventional  | Backbone=Qwen1.5-MoE-A... | 2026.06 | 39                 |
| DefaultMoE    | Backbone=Qwen1.5-MoE-A... | 2026.06 | 38.4               |
| Frozen Router | Backbone=Qwen1.5-MoE-A... | 2026.06 | 38.29              |
| DenseMixer    | Backbone=OLMoE-1B-7B,...  | 2026.06 | 37.5               |
| Frozen Router | Backbone=OLMoE-1B-7B,...  | 2026.06 | 36.05              |
| Conventional  | Backbone=OLMoE-1B-7B,...  | 2026.06 | 33.7               |
| Base Model    | Backbone=Qwen1.5-MoE-A... | 2026.06 | 28.29              |
| ReMoE         | Backbone=Qwen1.5-MoE-A... | 2026.06 | 25.8               |
| Base Model    | Backbone=OLMoE-1B-7B,...  | 2026.06 | 7.4                |
| SparseMixer   | Backbone=Qwen1.5-MoE-A... | 2026.06 | 2.1                |