Audio-visual generation on MultiDialog (test)
[Chart: SIM by date. Proposed: 0.624 (Jun 12, 2024). Metrics: SIM, FID, LSE-C, LSE-D.]
Updated 4d ago
Evaluation Results
Method                 System Type    Date     SIM    FID     LSE-C  LSE-D
Proposed               Audio-Visu...  2024.06  0.624  30.323  7.298  7.39
AVSR + LM + TTS + TFG  Cascaded S...  2024.06  0.433  30.581  7.041  7.64
d-GSLM                 Spoken Dia...  2024.06  0.211  -       -      -
SpeechGPT              Spoken Dia...  2024.06  0.194  -       -      -