Text Summarization on XSum (ROUGE-2, Final Gap)
[Chart: ROUGE-2. Top result: Pioneer Agent, 43.1 (Apr 10, 2026). Metrics shown: ROUGE-2, Final Gap.]
Updated 4d ago
Evaluation Results
Method           Details                      Date      ROUGE-2   Final Gap
Pioneer Agent    Model=Qwen3-8B, System...    2026.04   43.1      36.6
Pioneer Agent    Model=Qwen3-8B, System...    2026.04   38.9      36.6
Naive Baseline   Model=Qwen3-8B, System...    2026.04   10.9      36.6
Naive Baseline   Model=Qwen3-8B, System...    2026.04   6.5       36.6