Summary-level Evaluation on Narrative dataset Summary-level QA
[Chart: best ROUGE-1 score 59.3 (LENS), Dec 28, 2025. Selectable metrics: ROUGE-1, ROUGE-2, ROUGE-L, BLEU-4, METEOR, BERTScore, Coverage, Alignment.]
Updated 3mo ago
Evaluation Results
Method    Details                    Date     ROUGE-1  ROUGE-2  ROUGE-L  BLEU-4  METEOR  BERTScore  Coverage  Alignment
LENS      Model=Qwen2.5-14B-Inst...  2025.12  59.3     26.5     41       22      46.7    77.6       75.3      60.1
TS-Image  Model=Qwen2.5-VL-32B       2025.12  55.6     21.2     37.3     16.6    45.2    76.5       74        57.9
TS-Text   Model=Qwen2.5-14B-Inst...  2025.12  29.4     5.7      15.1     3.2     21.8    63.1       61.4      37.2