VGGSound-AVEL

Benchmarks

Task Name	Dataset Name	SOTA Result
Event Classification (V → A)	VGGSound-AVEL 90K	Precision 50.8	11
Cross-modal classification (Audio to Visual)	VGGSound-AVEL UCF to VGG 40K	Precision 66.5	4
Cross-dataset domain transfer (Visual to Audio)	VGGSound-AVEL 40K AVE to AVVP	Segment-level F1 56.3	4
