| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Image Classification | CLIP benchmark (test) | Accuracy 93 | 72 |
| Deferral-advice | CLIP prompt-escalation (val) | True Deferral-Advice Loss 0.337 | 66 |
| Action Category Classification | CLIP | F1 Score 88.7 | 56 |
| Membership Inference | CLIP image-text (train) | Precision 91.68 | 36 |
| Image Classification | CLIP 20-task suite | Average Top-1 Accuracy 93.1 | 24 |
| Image Classification | CLIP 14-task suite | Average Top-1 Accuracy 92.8 | 24 |
| Image Classification | CLIP Zero-shot Evaluation Suite (10 datasets) | Cars Accuracy 89.6 | 16 |
| Attention-only performance evaluation | CLIP | Speedup 2.15 | 14 |
| Image Classification | CLIP 8-task suite | Average Top-1 Accuracy 94.3 | 12 |
| Image Encoder Inference | CLIP | Latency (ms) 2.15 | 12 |
| Multilabel Classification | CLIP (test) | Micro F1 80.9 | 12 |
| Image Classification | CLIP Classification Suite | CIFAR-10 Accuracy 97.2 | 11 |
| Classification | CLIP 21-dataset Zero-Shot standard (test val) | ImageNet Accuracy 45 | 11 |
| Actionability Filtering | CLIP | Binary F1 Score 86.6 | 7 |
| Continual Learning | CLIP CL | BWT 14.86 | 6 |
| EQ Restoration | 7000 clip Short Segments (test) | Clarity 0.0421 | 6 |
| Classification | CLIP 200 random samples (test) | Macro F1 Score 0.24 | 6 |
| Lyric Intelligibility Prediction | CLIP (evaluation) | RMSE (%) 27.07 | 3 |
| Lyric Intelligibility Prediction | CLIP (val) | RMSE (%) 27.13 | 3 |
| Image Denoising | Clip300 sigma=60 | Average PSNR (dB) 25.51 | 3 |
| Image Denoising | Clip300 sigma=15 | Average PSNR (dB) 31.68 | 3 |
| Multi-shot Backdoor Classification | CLIP Multi-shot Downstream | BA 70.27 | 2 |