Long-form writing on LongBench-Write-en ([0, 500) context length)
Top Sl Score: 92.1 (GPT-4o), Feb 4, 2025
[Chart: Sl Score and Sq Score over time]
Updated 4d ago
Evaluation Results
| Method | Tuning | Date | Sl Score | Sq Score |
|---|---|---|---|---|
| GPT-4o | | 2025.02 | 92.1 | 93.1 |
| Qwen-2.5-14B | LongDPO | 2025.02 | 91.75 | 91.25 |
| Qwen-2.5-14B | DPO | 2025.02 | 91.72 | 90.53 |
| Llama3.1-70B-instruct | | 2025.02 | 90.8 | 84.8 |
| Llama3.1-8B-instruct | | 2025.02 | 89.7 | 84.6 |
| QWQ | | 2025.02 | 89.1 | 94.58 |
| Qwen-2.5-14B | Base | 2025.02 | 88.39 | 89.77 |
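
The results are listed in Sl Score order, and the ordering changes when sorting by Sq Score instead. The minimal sketch below re-ranks the same rows by either column. The scores are transcribed from the results above; the names `RESULTS`, `label`, and `rank_by` are hypothetical helpers introduced only for this illustration and are not part of the leaderboard site.

```python
# Rows transcribed from the results above: (method, tuning, Sl Score, Sq Score).
# A tuning of None means the leaderboard lists no tuning variant for that entry.
RESULTS = [
    ("GPT-4o", None, 92.1, 93.1),
    ("Qwen-2.5-14B", "LongDPO", 91.75, 91.25),
    ("Qwen-2.5-14B", "DPO", 91.72, 90.53),
    ("Llama3.1-70B-instruct", None, 90.8, 84.8),
    ("Llama3.1-8B-instruct", None, 89.7, 84.6),
    ("QWQ", None, 89.1, 94.58),
    ("Qwen-2.5-14B", "Base", 88.39, 89.77),
]

# Position of each score inside a row tuple.
SCORE_INDEX = {"sl": 2, "sq": 3}


def label(row):
    """Build a display label such as 'Qwen-2.5-14B (LongDPO)'."""
    method, tuning = row[0], row[1]
    return f"{method} ({tuning})" if tuning else method


def rank_by(rows, column):
    """Return display labels sorted by the chosen score, highest first."""
    index = SCORE_INDEX[column]
    ordered = sorted(rows, key=lambda row: row[index], reverse=True)
    return [label(row) for row in ordered]


if __name__ == "__main__":
    for column in ("sl", "sq"):
        print(column, rank_by(RESULTS, column))
```

Sorting by `"sq"` places QWQ first, while sorting by `"sl"` keeps GPT-4o first, so the choice of ranking column matters when comparing entries.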