NIAH

Benchmarks

| Task Name | Dataset Name | Metric | SOTA Result | Trend |
|---|---|---|---|---|
| Information Retrieval | NIAH (test) | Average Score | 99.6 | 59 |
| Multi-needle retrieval | NIAH (M) | Accuracy (NIAH M) | 90.2 | 35 |
| Long-context Retrieval | NIAH 128k | Single Score | 24.4 | 20 |
| Long-context Retrieval | NIAH 64k | Single Score | 49.3 | 20 |
| Long-context retrieval | NIAH multivalue | Speedup | 4.1 | 20 |
| Long Context Retrieval | NIAH-Multi | Accuracy | 100 | 13 |
| Synthetic Retrieval | NIAH | Score @ 4096 Context | 100 | 9 |
| Information Retrieval | NIAH single v1.0 (test) | Accuracy | 100 | 8 |
| Long-context retrieval | NIAH (avg) | Score (4k Context) | 100 | 7 |
| Long Context | NIAH | Accuracy | 99.8 | 6 |
| Long-context retrieval | NIAH 32k | NIAH Score | 99 | 6 |
| Long-context retrieval | NIAH 16k | NIAH Score | 98.6 | 6 |
| Needle-in-a-haystack | NIAH Needle-in-a-haystack | NIAH Success Rate (32K Context) | 100 | 6 |
| Long-context Retrieval | NIAH | Accuracy | 100 | 5 |
| Needle-in-a-haystack | NIAH 1 | Success Rate (1k Context) | 79.69 | 5 |
| Needle-in-a-haystack | NIAH-2 (test) | NIAH-2 Success Rate (1k) | 79.61 | 5 |
| Information Retrieval | NIAH multikey v1.0 (test) | Accuracy | 98.7 | 4 |
| Long Context & Context Learning | NIAH@1M RULER | pass@1 | 99 | 4 |
| Long-context recall | NIAH Single-3 | Recall @ 32K Context | 100 | 4 |
| Long-context recall | NIAH Single 2 | Recall @ 32K Context | 1 | 4 |
| Long-context recall | NIAH Single-1 | Recall @ 32K | 100 | 4 |
| Needle-In-A-Haystack Retrieval | NIAH Single 2 | Success Rate (4K Context) | 94 | 2 |