Long-Context Summarization on GovReport
Chart: ROUGE-1 Score over time, Oct 10, 2025 to Jan 9, 2026. Top result: SAC, 22.03.
Updated 1mo ago
Evaluation Results
| Method   | Configuration            | Date    | ROUGE-1 Score |
|----------|--------------------------|---------|---------------|
| SAC      | Maximum input length=32K | 2025.10 | 22.03         |
| EPL      | Maximum input length=32K | 2025.10 | 20.4          |
| FlashMem | Backbone=Llama 3.1 8B... | 2026.01 | 17.23         |
| MemGen   | Backbone=Llama 3.1 8B... | 2026.01 | 16.37         |
| CoT-SC   | Backbone=Llama 3.1 8B... | 2026.01 | 15.41         |
| FlashMem | Backbone=Llama 3.2 3B... | 2026.01 | 14.55         |
| SnapKV   | Backbone=Llama 3.2 3B... | 2026.01 | 14.43         |
| Vanilla  | Backbone=Llama 3.1 8B... | 2026.01 | 13.95         |
| CoT-SC   | Backbone=Llama 3.2 3B... | 2026.01 | 13.61         |
| MemGen   | Backbone=Llama 3.2 3B... | 2026.01 | 13.44         |
| Vanilla  | Backbone=Llama 3.2 3B... | 2026.01 | 12.92         |
| SnapKV   | Backbone=Llama 3.1 8B... | 2026.01 | 12.81         |
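For readers unfamiliar with the metric, ROUGE-1 measures unigram overlap between a generated summary and a reference summary. The sketch below is a minimal illustration only. This page does not state which variant (precision, recall, or F1), tokenizer, or stemming the listed methods used, so the whitespace tokenization, lowercasing, and the `rouge1` function name here are all assumptions. The scores above also appear to be reported on a 0 to 100 scale, whereas this sketch returns values between 0 and 1.

```python
from collections import Counter


def rouge1(candidate: str, reference: str) -> dict:
    """Unigram-overlap precision, recall, and F1 (illustrative sketch).

    Assumes lowercased whitespace tokenization; published evaluations
    often use a different tokenizer and may apply stemming.
    """
    cand_tokens = candidate.lower().split()
    ref_tokens = reference.lower().split()

    # Clipped overlap: each token counts at most as often as it
    # appears in the other text.
    overlap = sum((Counter(cand_tokens) & Counter(ref_tokens)).values())

    precision = overlap / len(cand_tokens) if cand_tokens else 0.0
    recall = overlap / len(ref_tokens) if ref_tokens else 0.0
    if overlap == 0:
        f1 = 0.0
    else:
        f1 = 2 * precision * recall / (precision + recall)
    return {"precision": precision, "recall": recall, "f1": f1}


scores = rouge1("the cat sat on the mat", "the cat lay on the mat")
print(scores)
```

In this example five of the six tokens in each text overlap, so precision, recall, and F1 all equal 5/6.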