| Task Name | Dataset Name | SOTA Result | Trend | |
|---|---|---|---|---|
| Malicious Pickle Detection | Curated Dataset standard (train-test) | TPR: 100 | 11 | |
| Jailbreak evaluation | curated dataset (test) | BAD BOT Rate: 0 | 11 | |
| Action-conditioned 4D scene generation | Curated dataset of 10 scenes (test) | Camera Control: 93.26 | 8 | |
| Action-conditioned 4D scene generation | Curated dataset of 10 scenes 1.0 (test) | Physics Plausibility: 93.5 | 7 | |
| Zero-shot Text-guided Video Editing | Curated dataset 90-frames | CLIP-F: 95.99 | 7 | |
| Overall Appearance Transfer Quality | Curated dataset 100 image pairs | DeQA: 4.1728 | 6 | |
| Material Transfer | Curated dataset 100 image pairs | CLIP-T Score: 0.2927 | 6 | |
| Semantic-Aware Appearance Transfer | Curated dataset 100 image pairs | CLIP-I: 88.32 | 6 | |
| Zero-shot Text-guided Video Editing | Curated dataset 8-frames | CLIP-F: 95.95 | 6 | |
| Zero-shot Text-guided Video Editing | Curated dataset 36-frames | CLIP-F: 9,318 | 5 | |
| Physical 3D action-conditioned video generation | Curated dataset of 30 images (test) | Action Following: 89.6 | 3 | |