AVSpeech

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Visual Acoustic Matching | AVSpeech-Rooms unseen environments (test) | RTE (s): 0.071 | 5 |
| Audio-Visual Speech Recognition | AVSpeech (1,000 manually filtered samples) | WER: 25 | 4 |