Listener Response Generation on Realtalk 1.0 (user study)
[Chart: Appropriateness Score by date. Labeled point: VLA model, 4.5, Mar 7, 2026. Selectable metrics: Appropriateness Score, Empathy Score, Engagement Score, Naturalness Score.]
Updated 1mo ago
Evaluation Results
| Method | Date | Appropriateness Score | Empathy Score | Engagement Score | Naturalness Score |
| VLA model (training_stage=SFT+RL) | 2026.03 | 4.5 | 4.1 | 4.2 | 4.5 |
| VLA model (training_stage=SFT) | 2026.03 | 3.2 | 3.4 | 3.7 | 3.3 |
| MMLHG | 2026.03 | 3 | 3.3 | 3.5 | 3.1 |
| LM-listener | 2026.03 | 2.7 | 3.1 | 3.4 | 2.9 |