Long-form writing on LongBench-Write-en ([4k, 20k))
[Chart: Sl Score and Sq Score. Top Sl Score: 21.5, Qwen-2.5-14B, Feb 4, 2025.]
Updated 4d ago
Evaluation Results
Method                  Tuning    Date      Sl Score   Sq Score
Qwen-2.5-14B            LongDPO   2025.02   21.5       86.04
Qwen-2.5-14B            Base      2025.02   19.57      82.35
Qwen-2.5-14B            DPO       2025.02   18.33      80.2
GPT-4o                  -         2025.02   6.2        81.2
QWQ                     -         2025.02   0.26       89.39
Llama3.1-8B-instruct    -         2025.02   0          57.6
Llama3.1-70B-instruct   -         2025.02   0          78