| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Image Classification | Waterbirds (test) | Worst-Group Accuracy 93 | 92 |
| Image Classification | Waterbirds | WG Accuracy 90.6 | 79 |
| Object Classification | Waterbirds (test) | Worst-Group Accuracy 90 | 22 |
| Classification | Waterbirds (test) | Test Accuracy 90.6 | 15 |
| Image Classification | Waterbirds Flip (test) | Accuracy 90.6 | 14 |
| Image Classification | Waterbirds Original (test) | Accuracy 98.4 | 14 |
| Image Classification | Waterbirds original unshifted | Worst Accuracy 90.8 | 10 |
| Binary Classification | Waterbirds CB | Accuracy 77.9 | 10 |
| Image Classification | Waterbirds (test) | Avg Acc (0.5% Bias) 63.64 | 10 |
| Classification | Waterbirds 5.0 severity (test) | Accuracy 66.33 | 10 |
| Classification | Waterbirds 2.0 severity (test) | Accuracy 65.23 | 10 |
| Classification | Waterbirds severity 1.0 (test) | Accuracy 0.6522 | 10 |
| Classification | Waterbirds severity 0.5 (test) | Accuracy 63.64 | 10 |
| Classification | Waterbirds (target domain) | Avg Acc 93.13 | 9 |
| Group-imbalanced Classification | Waterbirds (test) | Balanced Error 15.5 | 9 |
| Bias discovery | Waterbirds standard (test) | Precision@10 100 | 8 |
| Domain Generalization | Waterbirds | OOD Accuracy 90.5 | 8 |
| Image Classification | WaterBirds biased scenario (test) | Average Accuracy 87.42 | 7 |
| Image Classification | Waterbirds (shifted) | Worst Accuracy 93.7 | 5 |
| Binary Classification | Waterbirds (OOD) | Accuracy 69.3 | 5 |
| Place Classification | Waterbirds (test) | Unbiased Accuracy 89.2 | 5 |
| Pointing Game | Waterbirds 100% | Pointing Game Accuracy 59.27 | 4 |
| Pointing Game | Waterbirds 95% | Pointing Game Accuracy 69.38 | 4 |
| Few-shot transfer learning | Waterbirds 10-shot (OOD) | LL 98.2 | 2 |
| Few-shot transfer learning | Waterbirds 5-shot (OOD) | LL Performance 98.8 | 2 |