Reference-image-conditioned joint audio-video generation on User Study (test)
[Chart: Lip-Sync Accuracy. MMControl: 3.41 (Apr 21, 2026). Selectable metrics: Lip-Sync Accuracy, Facial Expression Realism, Action Naturalness, Text Alignment, Subject Alignment, Visual Quality, Overall Score.]
Updated 1mo ago
Evaluation Results
Method         Input                Date     Lip-Sync Accuracy  Facial Expression Realism  Action Naturalness  Text Alignment  Subject Alignment  Visual Quality  Overall Score
MMControl      Reference image...   2026.04  3.41               3.6                        3.58                3.57            3.56               3.77            3.58
Hallo3         Ground-truth aud...  2026.04  3.07               3.11                       3.08                3.28            3.53               3.22            3.22
HunyuanCustom  Ground-truth aud...  2026.04  2.9                3.17                       3.21                3.18            3.38               3.39            3.2
SadTalker      Ground-truth aud...  2026.04  2.35               1.78                       1.78                2.78            3.31               2.68            2.45
AniPortrait    Ground-truth aud...  2026.04  1.53               1.64                       1.58                2.56            3.15               2.37            2.14
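The page does not say how the Overall Score is computed. Judging only from the numbers listed, it appears consistent with an unweighted mean of the six per-criterion scores, to within rounding. The sketch below checks that assumption and ranks the methods by the reported Overall Score. The names `SCORES`, `mean_of_criteria`, `max_gap`, and `ranking` are hypothetical and chosen for illustration only.

```python
# Scores copied from the results above, in column order:
# lip-sync accuracy, facial expression realism, action naturalness,
# text alignment, subject alignment, visual quality, reported overall score.
SCORES = {
    "MMControl":     [3.41, 3.6, 3.58, 3.57, 3.56, 3.77, 3.58],
    "Hallo3":        [3.07, 3.11, 3.08, 3.28, 3.53, 3.22, 3.22],
    "HunyuanCustom": [2.9, 3.17, 3.21, 3.18, 3.38, 3.39, 3.2],
    "SadTalker":     [2.35, 1.78, 1.78, 2.78, 3.31, 2.68, 2.45],
    "AniPortrait":   [1.53, 1.64, 1.58, 2.56, 3.15, 2.37, 2.14],
}


def mean_of_criteria(row):
    """Unweighted mean of the six per-criterion scores (an assumed formula)."""
    return sum(row[:6]) / 6


def max_gap(scores):
    """Largest absolute difference between the assumed mean and the reported overall."""
    return max(abs(mean_of_criteria(row) - row[6]) for row in scores.values())


def ranking(scores):
    """Method names sorted by reported Overall Score, best first."""
    return sorted(scores, key=lambda name: scores[name][6], reverse=True)


if __name__ == "__main__":
    for name in ranking(SCORES):
        row = SCORES[name]
        print(f"{name:14s} reported={row[6]:.2f} mean={mean_of_criteria(row):.3f}")
    print(f"largest gap: {max_gap(SCORES):.4f}")
```

A gap below 0.01 for every method is what two-decimal rounding of the published scores would produce, so the check supports the mean assumption without confirming it.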