| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Video-to-Audio Retrieval | Internal V→A | Recall@1 50.3 | 10 |
| Next token prediction | internal | MRR 0.615 | 10 |
| Mask-based video object insertion | Internal (test) | MSE 103.09 | 9 |
| TA | Internal ID (train) | RM ACC 0.7902 | 8 |
| View-level Stenosis Classification | Internal (test) | AUC 0.699 | 8 |
| Patient-level Stenosis Classification | Internal (test) | AUC 84.5 | 8 |
| MRI Synthesis (T2 to T1c) | Internal | MSE 0.0088 | 8 |
| MRI Synthesis (T2 to T1) | Internal | MSE 0.0105 | 8 |
| MRI Synthesis (T1c to T2) | Internal | MSE 0.008 | 8 |
| MRI Synthesis (T1c to T1) | Internal | MSE 0.0172 | 8 |
| MRI Synthesis (T1 to T2) | Internal | MSE 0.0082 | 8 |
| MRI Synthesis (T1 to T1c) | Internal | MSE 0.0099 | 8 |
| Anomaly Detection | Internal | EER 11 | 8 |
| Video Frame Interpolation | Internal Dataset (10-fold cross-validation) | MAE 3.66 | 6 |
| Segment-level Stenosis Classification | Internal (test) | AUC 0.799 | 5 |
| Artery-level Stenosis Classification | Internal (test) | RCA AUC 0.812 | 5 |
| Speech denoising | internal dataset (test) | PESQ (WB) 2.45 | 5 |
| RECIST Measurement | Internal (hold-out test) | RECIST Error 0.111 | 4 |
| Text-to-Image Retrieval | internal-8M | Accuracy 68 | 4 |
| Voice Empathy | Internal (test) | Semantics-based Empathy 4.8 | 3 |
| Speech Instruction-Following | Internal (test) | Instruction Following Score 4.53 | 3 |