| Dataset Name | SOTA Method | Metric | Value | | |
|---|---|---|---|---|---|
| FDA (test) | GA-S2 | Score | 0.8004 | 120 | 3mo ago |
| RULER Context Length = 8K | | Average Accuracy (RULER 8K) | 89.59 | 72 | 1mo ago |
| RULER | | Score (4K) | 97.36 | 49 | 5d ago |
| RULER | | Single-key Accuracy | 100 | 29 | 5d ago |
| HELMET | FullAttention | Average Sparsity | 0 | 28 | 3mo ago |
| HELMET held-out eval | Qwen 2.5 32B | Accuracy (8K Context) | 57.61 | 13 | 3mo ago |
| RULER 32K | | Average Score (RULER 32K) | 88.6 | 12 | 2d ago |