Long-Context Summarization on GovReport
[Chart: ROUGE-1 Score. Best result: FlashMem, 17.23 (Jan 9, 2026).]
Evaluation Results
| Method | Configuration | Date | ROUGE-1 Score |
|---|---|---|---|
| FlashMem | Backbone=Llama 3.1 8B... | 2026.01 | 17.23 |
| MemGen | Backbone=Llama 3.1 8B... | 2026.01 | 16.37 |
| CoT-SC | Backbone=Llama 3.1 8B... | 2026.01 | 15.41 |
| FlashMem | Backbone=Llama 3.2 3B... | 2026.01 | 14.55 |
| SnapKV | Backbone=Llama 3.2 3B... | 2026.01 | 14.43 |
| Vanilla | Backbone=Llama 3.1 8B... | 2026.01 | 13.95 |
| CoT-SC | Backbone=Llama 3.2 3B... | 2026.01 | 13.61 |
| MemGen | Backbone=Llama 3.2 3B... | 2026.01 | 13.44 |
| Vanilla | Backbone=Llama 3.2 3B... | 2026.01 | 12.92 |
| SnapKV | Backbone=Llama 3.1 8B... | 2026.01 | 12.81 |