| Dataset Name | SOTA Method | Metric | Trend | Updated | |
|---|---|---|---|---|---|
| NLP Task Suite (Capitalize, Country-Capital, Present-Past, Singular-Plural, Person-Sport, AG News) (test) | ICL baseline | Capitalize 99.9 | 20 | 1mo ago | |
| 12 Downstream Classification Tasks | StateX | Accuracy 53 | 15 | 11d ago | |
| 9-dataset Average (SST-5, MNLI, CMSQA, HellaSwag, GeoQ, NL2Bash, Break, MTOP, SMCalFlow) (test) | CLG | Accuracy 68.08 | 15 | 1mo ago | |
| Fineweb-Edu 16.8B tokens | Spectra-AdEMAMix | ARC-c Accuracy 36.86 | 8 | 1mo ago | |
| ChemBench | ABMLL-MetaICL | Accuracy 58.4 | 6 | 16d ago | |
| LegalBench | ABMLL-MetaICL | Accuracy 79.5 | 6 | 16d ago | |
| Llama3-8B Scenario 5 ICL prompts | LTV | Accuracy 82.8 | 3 | 6d ago | |
| Llama3-8B Scenario 4: More layers & Pos. (P={-5,...}, L={0,4,...}) | LTV | Accuracy 46.38 | 3 | 6d ago | |
| Llama3-8B Scenario 3: More layers (L={0,4,8,...}) | LTV | Accuracy 80.43 | 3 | 6d ago | |
| Llama3-8B Scenario 2: More Pos. (P={-5,...,-1}) | LTV | Accuracy 78.18 | 3 | 6d ago | |
| Llama3-8B Scenario 1: Diff. Pos. (P={4}) | LTV | Accuracy 74.1 | 3 | 6d ago | |
| Llama3-8B Baseline (P={-1}, L={14}) | LTV | Accuracy 78.65 | 3 | 6d ago | |