| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Classification | AV-MNIST | Accuracy 72.38 | 12 |
| Digit Classification | AV-MNIST Vision modality standard (test) | Accuracy 71.32 | 4 |
| Clustering | AV-MNIST 3 modal | Gap Statistic 0.24 | 3 |
| Cross-modal Retrieval | AV-MNIST 3 modal | Gap 0.09 | 3 |