Spoken Dialogue System (SDS) Semantic Quality Evaluation on Eval2000 (test)

ROUGE-L: 12.1

Multi turn CoT E2E

Updated 5mo ago

Evaluation Results

Method	Links	ROUGE-L
Multi turn CoT E2E 2026.01		12.1	21.2	68.3	6.18	10.2	-
Multi turn CoT E2E + RLAIF (Single-Reward) 2026.01		11.9	19.9	56.5	6.33	7.1	55.4
Multi turn CoT E2E + RLAIF (Joint-Reward-v2) 2026.01		11.9	19.9	59.9	6.33	7.5	54.4
Multi turn CoT E2E + RLAIF (Joint-Reward-v1) 2026.01		11.8	19.6	61.3	6.29	8.5	52.6
Direct E2E 2026.01		8.4	302.2	51.5	5.5	24.2	-
Moshi 2026.01		8.1	136.5	57.8	5.71	21	-