Audio-Visual Speech Recognition on LRS2 (clean)
[Chart: WER over time, Jun 18, 2023 to Apr 1, 2026; best result 2.2 WER (MIR-GAN)]
Updated 17d ago
Evaluation Results
| Method | Details | Date | WER |
| --- | --- | --- | --- |
| MIR-GAN | Backbone=Transformer,... | 2023.06 | 2.2 |
| Base model | Backbone=Transformer,... | 2023.06 | 2.3 |
| MoCo+wav2vec | Backbone=Transformer,... | 2023.06 | 2.6 |
| MIR-GAN | Backbone=Conformer, Le... | 2023.06 | 3.2 |
| Hyb-Conformer | Backbone=Conformer, Le... | 2023.06 | 3.7 |
| Base model | Backbone=Conformer, Le... | 2023.06 | 3.9 |
| MIR-GAN | Backbone=Transformer,... | 2023.06 | 4.5 |
| Base model | Backbone=Transformer,... | 2023.06 | 5.4 |
| LF-MMI TDNN | Backbone=TDNN, Learnin... | 2023.06 | 5.9 |
| Hyb-RNN | Backbone=RNN, Learning... | 2023.06 | 7 |
| TM-CTC | Backbone=Transformer,... | 2023.06 | 8.2 |
| TM-seq2seq | Backbone=Transformer,... | 2023.06 | 8.5 |
| VisG AV-HuBERT Large | Params=484M | 2026.04 | 9.925 |
| AV-HuBERT Base | Params=160M | 2026.04 | 10.09 |
| AV-HuBERT Large | Params=477M | 2026.04 | 10.3 |
| VisG AV-HuBERT Base | Params=162M | 2026.04 | 10.58 |