Audio-Visual Speech Recognition on LRS3
[Chart: WER over time, Feb 22, 2026 to Mar 23, 2026. Highlighted entry: Llama-AVSR, WER 0.008]
Updated 25d ago
Evaluation Results
| Method | Details | Date | WER |
|---|---|---|---|
| Llama-AVSR | Labelled hours=680,000... | 2026.02 | 0.008 |
| Whisper-Flamingo | Labelled hours=680,000... | 2026.02 | 0.008 |
| USR 2.0 | Labelled hours=656, Un... | 2026.02 | 0.008 |
| Auto-AVSR | Labelled hours=3,448,... | 2026.02 | 0.009 |
| LP Conf | Labelled hours=100,000... | 2026.02 | 0.009 |
| Auto-AVSR | Labelled hours=1,902,... | 2026.02 | 0.01 |
| USR | Labelled hours=433, Un... | 2026.02 | 0.011 |
| ViT3D-CM | Labelled hours=90,000,... | 2026.02 | 0.016 |
| RNN-T | Labelled hours=31,000,... | 2026.02 | 0.045 |
| Whisper-flamingo | Preprocessing=Lip crop... | 2026.03 | 76 |
| HumanOmni-Speaker | Preprocessing=Raw video | 2026.03 | 76 |
| Llama-AVSR | Preprocessing=Lip crop... | 2026.03 | 77 |
| AutoAVSR | Preprocessing=Lip crop... | 2026.03 | 90 |
| Llama-SMoP | Preprocessing=Lip crop... | 2026.03 | 96 |