| Task Name | Dataset Name | Metric | SOTA Result | Trend |
|---|---|---|---|---|
| Needle-in-a-Haystack | Needle-in-a-Haystack | Accuracy | 100 | 44 |
| Needle-in-a-Haystack Retrieval | Needle-in-a-Haystack 32K context (test) | Accuracy | 76 | 30 |
| Needle-in-a-Haystack Retrieval | Needle-in-a-Haystack 8K context (test) | Accuracy | 100 | 30 |
| Long-context Retrieval | Needle-in-a-Haystack | Retrieval Accuracy | 100 | 10 |
| Long-context Information Retrieval | Needle-In-a-Haystack Verbatim prompt (test) | Accuracy (Depth 0%) | 0.996 | 7 |
| Needle-In-a-Haystack | Needle-In-a-Haystack Gemini prompt (test) | Success Rate @ 0% Insertion | 57.2 | 7 |
| Information Retrieval | Needle In A Haystack | Recall@1K | 90 | 6 |
| Long-context retrieval | Needle-in-a-Haystack 1.0 (test) | Score | 99.9 | 5 |
| Needle-in-a-haystack | Needle-in-a-haystack 8x original context | Accuracy | 52.2 | 4 |
| Needle-in-a-haystack | Needle-in-a-haystack 4x original context | Accuracy | 55 | 4 |
| Needle-in-a-haystack | Needle-in-a-haystack 2x original context | Needle-in-a-haystack Accuracy (2x Context) | 74.92 | 4 |
| Long-context retrieval | Needle-in-a-Haystack (NiH) | Accuracy (512 tokens) | 100 | 3 |