Response Generation on Dialogue dataset
[Chart: Coherence; Ground Truth: 3.67 (Apr 8, 2026). Metrics: Coherence, Fluency, Informativeness, Helpfulness, Overall Score.]
Updated 9d ago
Evaluation Results
| Method | Date | Coherence | Fluency | Informativeness | Helpfulness | Overall Score |
|---|---|---|---|---|---|---|
| Ground Truth | 2026.04 | 3.67 | 4.42 | 3.81 | 4.32 | 4.06 |
| Qwen3-8B (Fine-tuning=DRCR) | 2026.04 | 3.41 | 4.36 | 3.87 | 4.12 | 3.94 |
| SS-MPC | 2026.04 | 3.18 | 4.12 | 3.53 | 3.87 | 3.68 |
| Qwen3-8B (Fine-tuning=SFT) | 2026.04 | 3.12 | 4.03 | 3.51 | 3.83 | 3.62 |
| RL-TRC | 2026.04 | 3.01 | 3.89 | 3.4 | 3.62 | 3.48 |
| MADNet | 2026.04 | 2.85 | 3.64 | 3.19 | 3.37 | 3.26 |