| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Speculative decoding evaluation | OOD Mean | Speedup 5.21 | 20 |
| Unsupervised Object Segmentation | OOD 1.0 (test) | FG-ARI 7,824 | 16 |
| OOD Detection | OOD | AUC (Confidence) 0.822 | 9 |
| Speculative Decoding | OOD | Block Efficiency 2.13 | 5 |
| Defective Dialog Detection | OOD Shopping n = 105 (test) | Precision 48 | 5 |
| Unsupervised image annotation | OOD set | NMI 0.54 | 5 |
| Referential Communication | OOD set | Accuracy 92.7 | 5 |
| Open-ended Dialogue | OOD Average | Win Rate 60.5 | 4 |
| Table Understanding | OOD Table S2 (test) | ROUGE-L 40.38 | 4 |
| Table Understanding | OOD Table S1 (test) | Accuracy 80.2 | 4 |