| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Natural Language Inference | ChaosNLI S easy (test) | Top-1 Accuracy 62.4 | 27 |
| Natural Language Inference | ChaosNLI S_amb (test) | Top-1 Accuracy 41.8 | 27 |
| Natural Language Inference | ChaosNLI (test) | Top-1 Accuracy 50.5 | 27 |
| Probability Calibration | ChaosNLI combined SNLI+MNLI K=3 | ECE 2.15 | 26 |
| PMI ranking estimation | ChaosNLI (500 held-out pairs) | Spearman Rho 0.82 | 16 |
| Natural Language Inference Distribution Estimation | ChaosNLI | KL Divergence 0 | 12 |
| Natural Language Inference Distribution Estimation | ChaosNLI 1.0 (test) | KL Div (Dist) 0 | 8 |
| Natural Language Inference Distribution Estimation | ChaosNLI 1.0 (dev) | KL Divergence (BERT FT) 0.073 | 8 |
| Natural Language Inference Distribution Estimation | ChaosNLI (test) | KL Divergence 0.06 | 4 |