| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Retrieval-Augmented Generation | All Datasets Aggregated | Average Performance Score: 76.6 | 40 |
| Generalized Category Discovery | All Datasets Avg | Overall Accuracy: 75.1 | 12 |
| Lesion Segmentation | All Datasets | BBox Score: 0.777 | 6 |
| Image Generation | All Datasets | Fidelity: 54 | 4 |
| Preference Prediction | All Datasets Total | Significant Features Count (S): 43 | 2 |
| Alpha-law validation | All datasets | Clean Accuracy: 31.3 | 1 |