| Dataset Name | SOTA Method | Metric | Value | Trend | Updated |
|---|---|---|---|---|---|
| VGGSound (test) | GMS-CAVP | FAD | 0.75 | 62 | 4d ago |
| EchoFoley 6k | EchoVidia | Temporal Control Score | 72 | 9 | 4d ago |
| LongVale | MMHNet - L | FD (VGG) | 3.23 | 8 | 4d ago |
| UnAV100 | MMHNet - L | FD (VGG) | 1.8 | 8 | 4d ago |
| MUSIC (test) | | Overall Score | 4.3 | 8 | 4d ago |
| VGGSound sparse (test) | | Alignment | 4.82 | 8 | 4d ago |
| VGGSound original (test) | DIFF-FOLEY | Inception Score | 62.37 | 8 | 4d ago |
| Kling-Eval (test) | V-AURA | FDPaSST | 474.56 | 7 | 4d ago |
| VGGSound | MMAudio-L | FD_VGG | 0.97 | 6 | 4d ago |
| VisualSound (test) | V-AURA | KLD | 1.76 | 4 | 4d ago |
| Human Evaluation V2A | ReWaS | Audio Quality | 3.7 | 4 | 4d ago |
| VAS (test) | V-AURA | KLD | 1.98 | 3 | 4d ago |
| Kling-Audio-Eval | Omni2Sound | KL Divergence | 2.47 | 3 | 4d ago |
| Greatest Hits | CondFoleyGen | Accuracy | 23.94 | 2 | 4d ago |
| Video-to-Audio (test) | - | - | - | 0 | 4d ago |