
AVSpeech

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Speech-to-Portrait | AVSpeech (test) | L1 Error: 31.26 | 6 |
| Visual Acoustic Matching | AVSpeech-Rooms unseen environments (test) | RTE (s): 0.071 | 5 |
| Audio-Visual Speech Recognition | AVSpeech (1,000 manually filtered samples) | WER: 25 | 4 |