Long-context Reasoning on LongBench and NIAH (test)
[Chart: MultiFieldQA Score over time. Top entry: SmolLM-DroPE, 29.33 (Dec 13, 2025). Selectable metrics: MultiFieldQA Score, MuSiQue Score, GovReport Score, LCC Score, NIAH Retrieval Score, Average Score.]
Updated 4d ago
Evaluation Results
| Method | Paper | Date | MultiFieldQA Score | MuSiQue Score | GovReport Score | LCC Score | NIAH Retrieval Score | Average Score |
|---|---|---|---|---|---|---|---|---|
| SmolLM-DroPE | Zero-shot length gener... | 2025.12 | 29.33 | 7.93 | 21.87 | 18.56 | 74.92 | 30.52 |
| SmolLM + YaRN | Zero-shot length gener... | 2025.12 | 20.78 | 4.77 | 15.03 | 10.87 | 48.25 | 19.94 |
| SmolLM + RoPE-NTK | Zero-shot length gener... | 2025.12 | 18.87 | 4.89 | 23.71 | 8.26 | 29.84 | 17.11 |
| SmolLM + PI | Zero-shot length gener... | 2025.12 | 13.68 | 2.45 | 5.67 | 11.52 | 0 | 6.66 |
| SmolLM | Zero-shot length gener... | 2025.12 | 4.03 | 0.4 | 4.48 | 5.99 | 0 | 2.98 |
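
The page does not say how the Average Score is derived. The listed values are consistent with an unweighted mean of the five per-task scores (MultiFieldQA, MuSiQue, GovReport, LCC, NIAH Retrieval). The sketch below checks that assumption against the reported numbers. The names `scores`, `reported_average`, and `mean_score` are illustrative and not part of the leaderboard.

```python
# Per-task scores copied from the results above, in column order:
# MultiFieldQA, MuSiQue, GovReport, LCC, NIAH Retrieval.
scores = {
    "SmolLM-DroPE": [29.33, 7.93, 21.87, 18.56, 74.92],
    "SmolLM + YaRN": [20.78, 4.77, 15.03, 10.87, 48.25],
    "SmolLM + RoPE-NTK": [18.87, 4.89, 23.71, 8.26, 29.84],
    "SmolLM + PI": [13.68, 2.45, 5.67, 11.52, 0.0],
    "SmolLM": [4.03, 0.4, 4.48, 5.99, 0.0],
}

# Average Score values as reported on the page.
reported_average = {
    "SmolLM-DroPE": 30.52,
    "SmolLM + YaRN": 19.94,
    "SmolLM + RoPE-NTK": 17.11,
    "SmolLM + PI": 6.66,
    "SmolLM": 2.98,
}


def mean_score(values):
    """Unweighted arithmetic mean (assumed aggregation, not confirmed by the page)."""
    return sum(values) / len(values)


for method, values in scores.items():
    computed = mean_score(values)
    print(f"{method}: computed {computed:.2f}, reported {reported_average[method]:.2f}")
```

Because the mean is unweighted, the NIAH Retrieval Score, which has the widest range of the five columns, contributes most to the gaps between methods in the Average Score.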