QMSum

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Query-based meeting summarization | QMSum (test) | ROUGE-2 19.63 | 34 |
| Next Token Prediction | QMSum | Next Token Accuracy 47 | 32 |
| Long-context language generation | QMSum | Average Acceptance Length (τ) 3.15 | 25 |
| Summarization | QMSum (val) | ROUGE-L 0.2378 | 17 |
| Traceback (Prompt Injection Attacks) | QMSum | Precision 99 | 13 |
| Abstractive Summarization | QMSum | BLEU 6.75 | 11 |
| Context Traceback | QMSum LongBench | Precision 99 | 10 |
| Synthetic Text Generation | QMSum | Mean Embedding Similarity 52 | 10 |
| Document Summarization | QMSum (test) | ROUGE-1 38.9 | 10 |
| Document Summarization | QMSum | G-mean 15.47 | 9 |
| Summarization | QMSum | Std Dev ROUGE-1 0.3 | 8 |
| Query-focused Summarization | QMSum (test) | ROUGE-1 38.06 | 7 |
| Payload-splitting attack detection | QMSum | Precision (QMSum) 81 | 6 |
| Query-based Meeting Summarization | QMSum | ROUGE-L 10 | 6 |
| Query-focused Meeting Summarization | QMSum 50 samples | Fluency 4.88 | 6 |
| Summarization | QMSum (test) | Fluency 4.93 | 5 |
| Next Token Prediction | QMSum | Acc (BERT-Small, Epsilon=Inf) 32.82 | 4 |
| Abstractive Meeting Summarization | QMSum | Coreference 1.67 | 4 |
| Meeting Summarization | QMSum (all turns) | ROUGE-1 34.03 | 4 |
| Long-Context Summarization | QMSum | ROUGE-L 15.32 | 3 |
| Meeting Summarization | QMSum Gold turns only | ROUGE-1 40.2 | 3 |
| Query-based Summarization | QMSum SCROLLS (val) | ROUGE-1 30.9 | 2 |
| Transcript Challenge Assessment | QMSum (test) | Spoken Language Score 3 | 1 |
| Meeting Transcript Evaluation | QMSum | Coherence 4.5 | 1 |