| Task Name | Dataset Name | Metric | SOTA Result | Trend |
|---|---|---|---|---|
| Text Classification | TREC | Accuracy | 98 | 281 |
| Question Classification | TREC | Accuracy | 98.07 | 262 |
| Reranking | TREC DL 2020 | NDCG@10 | 70.86 | 132 |
| Question Classification | TREC (test) | Accuracy | 97.53 | 128 |
| Text Classification | TREC (test) | Accuracy | 97.2 | 122 |
| Reranking | TREC 2020 (test) | NDCG@10 | 70.9 | 55 |
| Query Performance Prediction | TREC DL'19 and DL'20 (2-fold train-test) | AP-τ | 40.33 | 48 |
| 6-way question classification | TREC 6-class (test) | Accuracy | 97.6 | 41 |
| Privacy Auditing | Trec | Empirical Privacy Lower Bound (ϵ_emp) | 0 | 40 |
| Reranking | TREC | NDCG@5 (DL19) | 74.45 | 35 |
| Text Classification | Trec synthetic noise (test) | Accuracy | 97.2 | 34 |
| Question Classification | TREC LongBench | Accuracy | 68.5 | 30 |
| Question Classification | TREC | Macro-F1 | 72.75 | 30 |
| Text Classification | TREC (val) | Top-1 Acc | 93.54 | 30 |
| Information Retrieval | TREC DL | NDCG@10 | 93.4 | 25 |
| Question Classification | TREC | Spearman's rho (x100) | 78.72 | 23 |
| End-to-end Open-Domain Question Answering | TREC (test) | Exact Match (EM) | 63.1 | 21 |
| Backdoor Defense | TREC | AUC | 0.99 | 20 |
| Information Retrieval | TREC Title queries 1-3 | MAP | 0.2873 | 19 |
| Passage retrieval | TREC (test) | Top-20 Accuracy | 95.5 | 17 |
| Question Classification | TREC 50 (test) | Accuracy | 97.2 | 17 |
| Query Performance Prediction | TREC DL'19 and DL'20 (2-fold) | AP (τ) | 41.76 | 16 |
| Factoid-style retrieval | TREC DL19 | NDCG@10 | 68.8 | 16 |
| Open-domain QA | Curated TREC | QA-F1 | 41.8 | 16 |
| Summarization | TREC | Accuracy | 65.2 | 15 |