| Dataset Name | SOTA Method | Metric | Value | Trend | Updated |
|---|---|---|---|---|---|
| VGGSound (test) | Foley-Flow | FAD | 0.52 | 83 | 9d ago |
| VGGSound | MMAudio-L | FD_VGG | 0.97 | 22 | 1mo ago |
| VGG-Sound | - | Fréchet Distance (FD) | 0 | 10 | 16d ago |
| EchoFoley 6k | EchoVidia | Temporal Control Score | 72 | 9 | 1mo ago |
| LongVale | MMHNet - L | FD (VGG) | 3.23 | 8 | 1mo ago |
| UnAV100 | MMHNet - L | FD (VGG) | 1.8 | 8 | 1mo ago |
| MUSIC (test) | - | Overall Score | 4.3 | 8 | 1mo ago |
| VGGSound sparse (test) | - | Alignment | 4.82 | 8 | 1mo ago |
| VGGSound original (test) | DIFF-FOLEY | Inception Score | 62.37 | 8 | 1mo ago |
| OGameData (test) | - | FD | 0 | 7 | 16d ago |
| FoleyBench (test) | - | FD | 0 | 7 | 16d ago |
| AudioCanvas (out-of-domain) | PrismAudio | CLAP | 52 | 7 | 1mo ago |
| Kling-Eval (test) | V-AURA | FDPaSST | 474.56 | 7 | 1mo ago |
| VGGSound-Director (test) | - | FD (VGG) | 0 | 6 | 27d ago |
| VGGSound 10 (test) | MMAudio | FAD | 5.32 | 4 | 23d ago |
| VisualSound (test) | V-AURA | KLD | 1.76 | 4 | 1mo ago |
| Human Evaluation V2A | ReWaS | Audio Quality | 3.7 | 4 | 1mo ago |
| VAS (test) | V-AURA | KLD | 1.98 | 3 | 1mo ago |
| Kling-Audio-Eval | Omni2Sound | KL Divergence | 2.47 | 3 | 1mo ago |
| Greatest Hits | CondFoleyGen | Accuracy | 23.94 | 2 | 1mo ago |
| Video-to-Audio (test) | - | - | - | 0 | 1mo ago |