Single-Doc Question Answering on LongBench-E
Top result: 43.3 F1 Score, Teacher Model (w/ Context), Oct 23, 2025
Updated 22d ago
Evaluation Results
| Method | Configuration | Date | F1 Score |
|---|---|---|---|
| Teacher Model (w/ Context) | Backbone=Qwen3-1.7B, C... | 2025.10 | 43.3 |
| Mean-Pooling | Backbone=Qwen3-1.7B, C... | 2025.10 | 39.7 |
| Compression-Tokens (Bidirectional) | Backbone=Qwen3-1.7B, C... | 2025.10 | 35.9 |
| Compression-Tokens (Causal) | Backbone=Qwen3-1.7B, C... | 2025.10 | 33.3 |
| Mean-Pooling | Backbone=Qwen3-1.7B, C... | 2025.10 | 32.5 |
| Compression-Tokens (Bidirectional) | Backbone=Qwen3-1.7B, C... | 2025.10 | 30.5 |
| Mean-Pooling | Backbone=Qwen3-1.7B, C... | 2025.10 | 24.2 |
| LLMLingua2 | Backbone=Qwen3-1.7B, C... | 2025.10 | 20.7 |
| Compression-Tokens (Causal) | Backbone=Qwen3-1.7B, C... | 2025.10 | 19.5 |
| Compression-Tokens (Bidirectional) | Backbone=Qwen3-1.7B, C... | 2025.10 | 19.1 |
| Compression-Tokens (Causal) | Backbone=Qwen3-1.7B, C... | 2025.10 | 17.9 |
| PCC Large | Backbone=Llama3.1-8B,... | 2025.10 | 17.2 |
| LLMLingua2 | Backbone=Qwen3-1.7B, C... | 2025.10 | 12.5 |
| Teacher Model (w/o Context) | Backbone=Qwen3-1.7B, C... | 2025.10 | 10.1 |
| LLMLingua2 | Backbone=Qwen3-1.7B, C... | 2025.10 | 8.9 |
| PCC Large | Backbone=Llama3.1-8B,... | 2025.10 | 5.6 |
| PCC Large | Backbone=Llama3.1-8B,... | 2025.10 | 2.4 |