Dialogue Summarization on SAMSum 30 samples (test)
[Chart: Faithfulness score by method. Highlighted point: ChatGPT, 4.52 (Oct 17, 2023). Available metrics: Faithfulness, Fluency, Informativeness, Conciseness.]
Updated 1mo ago
Evaluation Results
| Method | Evaluator | Date | Faithfulness | Fluency | Informativeness | Conciseness |
|---|---|---|---|---|---|---|
| ChatGPT | Human Annotator | 2023.10 | 4.52 | 4.38 | 4.62 | 2.77 |
| Human-written | Human Annotator | 2023.10 | 4.34 | 4.54 | 3.58 | 4.36 |
| InstructDS | Human Annotator | 2023.10 | 4.13 | 4.35 | 3.54 | 4.23 |
| Flan-UL2 | Human Annotator | 2023.10 | 4 | 4.38 | 3.03 | 4.29 |
| BART | Human Annotator | 2023.10 | 3.85 | 4.36 | 3.22 | 4.3 |
| Alpaca | Human Annotator | 2023.10 | 3.24 | 3.77 | 3.45 | 3.11 |