| Small CNN Zoo ReLU subset (test) | HNP | Kendall’s Tau 0.926 | | 35 | 1mo ago |
| WMT Benchmarks Average WMT'14 & WMT'19 (aggregation) | LLM-PP | MAE 0.29 | | 16 | 3mo ago |
| WMT En-De 2019 (val) | LLM-PP | MAE 0.29 | | 16 | 3mo ago |
| WMT En-Fr 2014 (val) | LLM-PP | MAE 0.28 | | 16 | 3mo ago |
| WMT En-De 2014 (val) | LLM-Distill-PP | MAE 0.22 | | 16 | 3mo ago |
| NAS-Bench-101 (val) | ConvNP | Accuracy 94.76 | | 11 | 22d ago |
| ARC 1.2k (test) | Metabench | MAE 1.14 | | 11 | 3mo ago |
| Winogrande (WG) 1.3k (test) | DISCO | MAE 1 | | 11 | 3mo ago |
| HellaSwag (HS) 10k (test) | Metabench | MAE 0.8 | | 11 | 3mo ago |
| MMLU 14k (test) | DISCO | MAE 1.07 | | 11 | 3mo ago |
| Large Model Performance Prediction Dataset 80% masking (test) | STAR | RMSE 7.5 | | 10 | 3mo ago |
| MNLI source domains (out-of-domain) | Cosine distance (fine-tuned) | ROC AUC 0.683 | | 10 | 3mo ago |
| MNLI source domains (in-domain) | Cosine distance (fine-tuned) | ROC AUC 0.699 | | 10 | 3mo ago |
| Sentiment temporal (out-of-domain) | Cosine distance (fine-tuned) | ROC AUC 0.834 | | 10 | 3mo ago |
| Sentiment temporal (in-domain) | Cosine distance (fine-tuned) | ROC AUC 0.852 | | 10 | 3mo ago |
| Sentiment categories (out-of-domain) | Cosine distance (fine-tuned) | ROC AUC 0.822 | | 10 | 3mo ago |
| Sentiment categories (in-domain) | Cosine distance (fine-tuned) | ROC AUC 0.845 | | 10 | 3mo ago |
| NAS-Bench-201 ImageNet16 (test) | CARL | Kendall's Tau 0.63 | | 9 | 22d ago |
| NAS-Bench CIFAR-100 201 (test) | CARL | Kendall's Tau 0.65 | | 9 | 22d ago |
| NAS-Bench-201 CIFAR-10 (test) | CARL | Kendall's Tau 0.64 | | 9 | 22d ago |
| Tatoeba | | MAE 5.82 | | 9 | 3mo ago |
| MewsliX | MAML | MAE 9.33 | | 9 | 3mo ago |
| LAREQA | | MAE 1.51 | | 9 | 3mo ago |
| XQUAD | MDGPR | MAE 3.15 | | 9 | 3mo ago |
| TyDiQA | | MAE 4.29 | | 9 | 3mo ago |