| Dataset Name | SOTA Method | Metric | Value | Trend | |
|---|---|---|---|---|---|
| Needle-in-a-Haystack (test) | | Accuracy | 100 | 56 | 2mo ago |
| RULER | | Retrieval Accuracy (8K) | 98.14 | 44 | 11d ago |
| Needle-in-the-Haystack 10k-context | Quest | Accuracy | 100 | 30 | 1mo ago |
| BRIGHT StackExchange | Claude Sonnet 4.5 | Biology Score | 62.3 | 29 | 26d ago |
| Needle-in-a-Haystack | | Retrieval Accuracy | 100 | 29 | 18h ago |
| RULER 16K | Ministral-3-8B-Instruct-2512-BF16 | Score | 93.4 | 28 | 22d ago |
| S-NIAH | | Latency (s) | 10.5 | 27 | 2mo ago |
| NIAH Single 3 | Looped Hybrid (GDN+DSA) | Accuracy (1024) | 100 | 22 | 21h ago |
| NIAH 128k | | Single Score | 24.4 | 20 | 1mo ago |
| NIAH 64k | | Single Score | 49.3 | 20 | 1mo ago |
| Lost-in-the-Middle 30-passage contexts | PRISM-∆ | Average Exact Match | 62.57 | 20 | 2mo ago |
| NIAH multivalue | FLy | Speedup | 4.1 | 20 | 3mo ago |
| RULER 64K context | WindowedManifoldKV | Accuracy | 84.3 | 19 | 1mo ago |
| MLDR | Ettin-Enc-1B | MLDR | 40.2 | 17 | 2mo ago |
| RULER 32k context (test) | Ministral-3-8B-Instruct-2512-BF16 | Accuracy | 91.7 | 16 | 22d ago |
| RULER 4k context (test) | Llama-3.1-8B-Instruct | Accuracy | 95 | 16 | 22d ago |
| RULER | | Accuracy | 80.1 | 14 | 2mo ago |
| NIAH-Multi | Kimi-K2 | Accuracy | 100 | 13 | 3mo ago |
| S-NIAH | Llama3.1 8B Instruct | Exact Match Accuracy | 99.6 | 12 | 25d ago |
| RULER 128K | Hybrid-KSA | Score | 71.67 | 12 | 1mo ago |
| RULER 32K | Hybrid-KSA | RULER 32K Retrieval Score | 86.65 | 12 | 1mo ago |
| RULER 4K | Hybrid-KSA | Score | 92.97 | 12 | 6d ago |
| RULER-NIAH 128k | | Accuracy | 97.2 | 9 | 14d ago |
| RULER-NIAH 64k | | Accuracy | 98.8 | 9 | 14d ago |
| RULER-NIAH 32k | | Accuracy | 100 | 9 | 14d ago |