Needle-in-a-Haystack

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Needle-in-a-Haystack | Needle-in-a-Haystack | Accuracy: 100 | 44 |
| Needle-in-a-Haystack Retrieval | Needle-in-a-Haystack 32K context (test) | Accuracy: 76 | 30 |
| Needle-in-a-Haystack Retrieval | Needle-in-a-Haystack 8K context (test) | Accuracy: 100 | 30 |
| Long-context Retrieval | Needle-in-a-Haystack | Retrieval Accuracy: 100 | 10 |
| Long-context Information Retrieval | Needle-In-a-Haystack Verbatim prompt (test) | Accuracy (Depth 0%): 0.996 | 7 |
| Needle-In-a-Haystack | Needle-In-a-Haystack Gemini prompt (test) | Success Rate @ 0% Insertion: 57.2 | 7 |
| Information Retrieval | Needle In A Haystack | Recall@1K: 90 | 6 |
| Long-context retrieval | Needle-in-a-Haystack 1.0 (test) | Score: 99.9 | 5 |
| Needle-in-a-haystack | Needle-in-a-haystack 8x original context | Accuracy: 52.2 | 4 |
| Needle-in-a-haystack | Needle-in-a-haystack 4x original context | Accuracy: 55 | 4 |
| Needle-in-a-haystack | Needle-in-a-haystack 2x original context | Needle-in-a-haystack Accuracy (2x Context): 74.92 | 4 |
| Long-context retrieval | Needle-in-a-Haystack (NiH) | Accuracy (512 tokens): 100 | 3 |