| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Composed Image Retrieval | Standard Benchmarks (CIRR, FashionIQ, GeneCIS) | Average Performance: 38.3 | 10 |
| Language Modeling and Question Answering | Standard Benchmarks (ARC-E, ARC-C, BoolQ, HellaSwag, OBQA, PIQA, WinoGrande, MMLU, SciQ) (test) | ARC-E Acc (Norm): 49.75 | 8 |
| Text-to-image | Standard text-to-image benchmarks | CLIP Score: 97.28 | 6 |