Summary Similarity Evaluation on GPT generated summaries 5.1
[Chart: BERTScore-F1 by method. Highlighted result: AutoMUP, 86.5; data point dated Apr 8, 2026. Selectable metrics: BERTScore-F1, SBERT Similarity, SimCSE Similarity, USE Similarity, ROUGE-L, BLEURT Score.]
Updated 1mo ago
Evaluation Results
Method   Configuration       Date     BERTScore-F1  SBERT Similarity  SimCSE Similarity  USE Similarity  ROUGE-L  BLEURT Score
AutoMUP  Consensus level=A1  2026.04  86.5          65.5              96.8               65.1            18.2     38.3
AutoMUP  Consensus level=A2  2026.04  85.8          58.5              96.8               60              14.2     29
AutoMUP  Consensus level=A3  2026.04  85.4          58.5              96.7               57.8            13.3     25.9