Long-form writing on LongBench-Write-en [2k, 4k)
[Chart: Sl Score and Sq Score over time. Best Sl Score: 78.79 (Qwen-2.5-14B), Feb 4, 2025.]
Updated 4d ago
Evaluation Results
Method                 Tuning   Date     Sl Score  Sq Score
Qwen-2.5-14B           LongDPO  2025.02  78.79     89.01
Qwen-2.5-14B           Base     2025.02  71.38     87.5
Qwen-2.5-14B           DPO      2025.02  68.94     86.5
GPT-4o                 -        2025.02  53        92.8
QWQ                    -        2025.02  33.13     93.75
Llama3.1-8B-instruct   -        2025.02  29.2      76.1
Llama3.1-70B-instruct  -        2025.02  14.9      84.5