| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Image Classification | Colored-MNIST (CMNIST) | Accuracy: 94.43 | 42 |
| Image Classification | Colored MNIST unbiased (test) | Accuracy: 98.02 | 28 |
| Digit Classification | Colored MNIST foreground color (test) | Unbiased Accuracy: 86.37 | 24 |
| Image Classification | Colored MNIST (test) | Accuracy: 75 | 24 |
| Image Classification | Colored MNIST (unbiased) | Accuracy: 0.9628 | 16 |
| Image Classification | Colored MNIST (bias-guiding) | Accuracy: 100 | 16 |
| Image Classification | Colored MNIST (train) | Accuracy: 88.9 | 14 |
| Image Classification | Colored MNIST | Accuracy (+90% Threshold): 73.1 | 9 |
| Image Classification | Colored MNIST background color, ratio 0.95 | Bias-Aligned Accuracy: 99.9 | 5 |
| Image Classification | Colored MNIST background color, ratio 0.98 | Bias-Aligned Accuracy: 99.97 | 5 |
| Image Classification | Colored MNIST background color, ratio 0.99 | Bias Aligned Score: 100 | 5 |
| Image Classification | Colored MNIST background color, ratio 0.995 | Aligned Accuracy: 1 | 5 |
| Digit Classification | Colored MNIST (val) | Validation Accuracy (Kendall's tau): 1 | 5 |
| Color Classification | Colored-MNIST (val) | Unbiased Accuracy: 99.95 | 4 |
| OOD Detection | Colored MNIST In-distribution vs Textures protocol (test) | TNR @ 95% TPR: 100 | 2 |
| OOD Detection | Colored MNIST In-distribution vs iSUN protocol (test) | TNR @ 95% TPR: 100 | 2 |
| OOD Detection | Colored MNIST In-distribution vs LSUN protocol (test) | TNR @ 95% TPR: 1 | 2 |
| OOD Detection | Colored MNIST In-distribution vs Spurious OOD protocol (test) | TNR @ 95% TPR: 0.958 | 2 |