| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Image-Text Retrieval | General Domain | Retrieval Score: 31.27 | 30 |
| Image Classification | General Domain 31 tasks | CLS Score: 57.97 | 30 |
| Instruction following | General Domain AlpacaEval Arena-Hard LLaMA3-8B (10% selection) | AlpacaEval Score: 12.09 | 18 |
| Chinese-to-English speech translation | General-domain (test) | BLEU: 40.77 | 6 |
| Question Answering | General Domain Average | Average EM: 42.35 | 5 |
| Language Modeling | General Domain (holdout test) | L_inf: 0.79 | 4 |