Long-context Reasoning on LongBench and NIAH (test)
[Chart: MultiFieldQA Score by method; top entry SmolLM-DroPE at 29.33, Dec 13, 2025. Selectable metrics: MultiFieldQA Score, MuSiQue Score, GovReport Score, LCC Score, NIAH Retrieval Score, Average Score.]
Updated 1mo ago
Evaluation Results
| Method (category, date) | MultiFieldQA Score | MuSiQue Score | GovReport Score | LCC Score | NIAH Retrieval Score | Average Score |
|---|---|---|---|---|---|---|
| SmolLM-DroPE (Zero-shot length gener..., 2025.12) | 29.33 | 7.93 | 21.87 | 18.56 | 74.92 | 30.52 |
| SmolLM + YaRN (Zero-shot length gener..., 2025.12) | 20.78 | 4.77 | 15.03 | 10.87 | 48.25 | 19.94 |
| SmolLM + RoPE-NTK (Zero-shot length gener..., 2025.12) | 18.87 | 4.89 | 23.71 | 8.26 | 29.84 | 17.11 |
| SmolLM + PI (Zero-shot length gener..., 2025.12) | 13.68 | 2.45 | 5.67 | 11.52 | 0 | 6.66 |
| SmolLM (Zero-shot length gener..., 2025.12) | 4.03 | 0.4 | 4.48 | 5.99 | 0 | 2.98 |
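The page does not say how Average Score is derived. The reported values are consistent with an unweighted mean of the five task scores, rounded to two decimals. The sketch below checks that assumption against the numbers listed above. The names `task_scores`, `reported_average`, and `mean_score` are illustrative and not part of the benchmark.

```python
# Per-method scores copied from the results above, in the order:
# MultiFieldQA, MuSiQue, GovReport, LCC, NIAH Retrieval.
task_scores = {
    "SmolLM-DroPE": [29.33, 7.93, 21.87, 18.56, 74.92],
    "SmolLM + YaRN": [20.78, 4.77, 15.03, 10.87, 48.25],
    "SmolLM + RoPE-NTK": [18.87, 4.89, 23.71, 8.26, 29.84],
    "SmolLM + PI": [13.68, 2.45, 5.67, 11.52, 0.0],
    "SmolLM": [4.03, 0.4, 4.48, 5.99, 0.0],
}

# Average Score as reported on the page.
reported_average = {
    "SmolLM-DroPE": 30.52,
    "SmolLM + YaRN": 19.94,
    "SmolLM + RoPE-NTK": 17.11,
    "SmolLM + PI": 6.66,
    "SmolLM": 2.98,
}


def mean_score(scores):
    """Unweighted arithmetic mean (assumed aggregation, not confirmed by the page)."""
    return sum(scores) / len(scores)


for method, scores in task_scores.items():
    computed = mean_score(scores)
    print(f"{method}: computed {computed:.2f}, reported {reported_average[method]:.2f}")
```

Because the mean is unweighted, the NIAH Retrieval Score, which has the widest range of the five metrics, contributes most to the spread in Average Score across methods.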