AVSpeech

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Speech-to-Portrait | AVSpeech (test) | L1 Error: 31.26 | 6 |
| Visual Acoustic Matching | AVSpeech-Rooms unseen environments (test) | RTE (s): 0.071 | 5 |
| Audio-Visual Speech Recognition | AVSpeech (1,000 manually filtered samples) | WER: 25 | 4 |
| Audio-Visual Speech Extraction | AVSpeech | SI-SDR (dB): 10.2 | 1 |