| Dataset Name | SOTA Method | Metric | Value | Trend | |
|---|---|---|---|---|---|
| RULER | Full attention | S-NIAH-1 (Pass-Key Retrieval) | 100 | 42 | 1mo ago |
| Needle-in-a-Haystack 32K context (test) | Quest | Accuracy | 76 | 30 | 1mo ago |
| Needle-in-a-Haystack 8K context (test) | Quest | Accuracy | 100 | 30 | 1mo ago |
| RULER (test) | DroPE transformer | Multi-Query Success Rate | 2,800 | 8 | 1mo ago |
| RULER S-NIAH-2 OOD | ASEntmax | Success Rate (4K Context) | 83.2 | 4 | 1mo ago |
| RULER S-NIAH-2 (ID) | Softmax | Retrieval Success Rate (1K) | 100 | 4 | 1mo ago |
| RULER S-NIAH-1 OOD | ASEntmax | Success Rate (4K Context) | 100 | 4 | 1mo ago |
| RULER S-NIAH-1 ID | Softmax | Retrieval Success Rate (1K Context) | 100 | 4 | 1mo ago |
| BABILong 32K context length | MemDLM (Train & Inference) | Accuracy | 9 | 3 | 25d ago |
| BABILong 16K context length | MemDLM (Train & Inference) | Needle-in-a-Haystack Accuracy (16K) | 22.2 | 3 | 25d ago |
| RULER 32K context length | MemDLM (Train & Inference) | RULER-MV Retrieval Score | 15.35 | 3 | 25d ago |
| RULER 16K context length | MemDLM (Train & Inference) | RULER-MV Score | 29.4 | 3 | 25d ago |
| NIAH Single 2 | StateX | Success Rate (4K Context) | 94 | 2 | 10d ago |