NIAH

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Information Retrieval | NIAH (test) | Average Score: 99.6 | 59 |
| Multi-needle retrieval | NIAH (M) | Accuracy (NIAH M): 90.2 | 35 |
| Long-context Retrieval | NIAH Single 3 | Accuracy (1024): 100 | 22 |
| Long-context Retrieval | NIAH 128k | Single Score: 24.4 | 20 |
| Long-context Retrieval | NIAH 64k | Single Score: 49.3 | 20 |
| Long-context retrieval | NIAH multivalue | Speedup: 4.1 | 20 |
| Needle-In-A-Haystack Retrieval | NIAH | NIAH Score: 100 | 14 |
| Long Context Retrieval | NIAH-Multi | Accuracy: 100 | 13 |
| Synthetic Retrieval | NIAH | Score @ 4096 Context: 100 | 9 |
| Long-context Retrieval | NIAH Single 2 | Accuracy (1024 tokens): 100 | 8 |
| Long-context Retrieval | NIAH Single-1 | Accuracy (1024): 100 | 8 |
| Needle-in-a-haystack retrieval | NIAH 64K 60 items, 3 needle positions (test) | F1 Score: 28.2 | 8 |
| Information Retrieval | NIAH single v1.0 (test) | Accuracy: 100 | 8 |
| Long-tail retrieval | NIAH 7-client multi-needle retrieval (non-IID partition) | MK-NIAH: 40 | 7 |
| Long-context retrieval | NIAH (avg) | Score (4k Context): 100 | 7 |
| Runtime efficiency | NIAH (500 samples) | Latency (8K Context): 34.01 | 6 |
| Needle In A Haystack retrieval | NIAH L=2048 | Accuracy: 100 | 6 |
| Long Context | NIAH | Accuracy: 99.8 | 6 |
| Long-context retrieval | NIAH 32k | NIAH Score: 99 | 6 |
| Long-context retrieval | NIAH 16k | NIAH Score: 98.6 | 6 |
| Needle-in-a-haystack | NIAH Needle-in-a-haystack | NIAH Success Rate (32K Context): 100 | 6 |
| Long-context Retrieval | NIAH | Accuracy: 100 | 5 |
| Needle-in-a-haystack | NIAH 1 | Success Rate (1k Context): 79.69 | 5 |
| Needle-in-a-haystack | NIAH-2 (test) | NIAH-2 Success Rate (1k): 79.61 | 5 |
| Long-range retrieval | NIAH 32K | Accuracy: 90.9 | 4 |
Showing 25 of 37 rows