Gist Summarization on Gist Summarization (Tok-F1, chrF)
[Figure: leaderboard chart. Top result: LatentQA, 30.6 Tok-F1 (May 25, 2026). Metrics shown: Tok-F1, chrF Score.]
Updated 8d ago
Evaluation Results
| Method | Configuration | Date | Tok-F1 | chrF Score |
|---|---|---|---|---|
| LatentQA | Donor=Llama-3.1-8B-Ins... | 2026.05 | 30.6 | 28 |
| UAV | Donor=Qwen3-4B-Instruc... | 2026.05 | 30.1 | 28.5 |
| UAV | Donor=Qwen3-4B-Instruc... | 2026.05 | 29.7 | 28.3 |
| UAV | Donor=Llama-3.1-8B-Ins... | 2026.05 | 29.3 | 27.7 |
| AO | Donor=Llama-3.1-8B-Ins... | 2026.05 | 29.3 | 27.5 |
| LatentQA | Donor=Qwen3-4B-Instruc... | 2026.05 | 29 | 27.5 |
| UAV | Donor=Llama-3.1-8B-Ins... | 2026.05 | 28.8 | 27.1 |
| UAV | Donor=Llama-3.1-8B-Ins... | 2026.05 | 28.5 | 27.5 |
| AO | Donor=Qwen3-4B-Instruc... | 2026.05 | 25.9 | 25.6 |
| PatchScope | Donor=Llama-3.1-8B-Ins... | 2026.05 | 13.8 | 18.2 |
| SelfIE | Donor=Llama-3.1-8B-Ins... | 2026.05 | 12.8 | 17.6 |
| PatchScope | Donor=Qwen3-4B-Instruc... | 2026.05 | 12 | 17.6 |
| SelfIE | Donor=Qwen3-4B-Instruc... | 2026.05 | 11.7 | 17.6 |