Spec-Bench

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Speculative Decoding | Spec-Bench | MT Score 195.6 | 48 |
| Inference Acceleration | Spec-Bench | MAT Score 4.62 | 39 |
| Text Generation | Spec-Bench Overall | SD Score 2.33 | 21 |
| Translation | Spec-Bench Trans. | CR 6.41 | 21 |
| Summarization | Spec-Bench Sum. | CR Score 4.73 | 21 |
| Retrieval-Augmented Generation | Spec-Bench RAG | CR 5.48 | 21 |
| Question Answering | Spec-Bench QA | CR 4.54 | 21 |
| Multi-turn Dialogue | Spec-Bench Multi. | CR 3.22 | 21 |
| Mathematical Reasoning | Spec-Bench Math | CR 4.09 | 21 |
| Language Model Decoding | Spec-Bench | Conv. Acc 267.6 | 11 |
| Speculative Decoding Throughput | Spec-Bench | Throughput (Conv.) 519.7 | 10 |
| Speculative Decoding | Spec-Bench OLMo 2 7B | Conversation Score 5.12 | 5 |
| Speculative Decoding Throughput | Spec-Bench (test) | Throughput (Conv.) - | 0 |