| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Audio Question Answering | ClothoAQA (test) | Accuracy 71.02 | 14 |
| Audio Understanding | ClothoAQA | CIDEr 32.6 | 6 |
| Audio Question Answering | ClothoAQA numerical | Accuracy 36.4 | 2 |
| Audio Question Answering | ClothoAQA non-binary | Accuracy 49.5 | 2 |
| Audio Question Answering | ClothoAQA unanimous | Accuracy 86.9 | 2 |
| Audio Understanding | ClothoAQA (held-out test) | CIDEr - | 0 |