Target-guided proactive dialogue generation on DuRecDial OOD 2.0 (test)
[Chart: PPL by date. Highlighted result: T5-Flan, PPL 6.46 (May 12, 2026). Selectable metrics: PPL, W. F1, BLEU-1, BLEU-2, DIST-1, DIST-2, K. F1, Failure Rate.]
Updated 21d ago
Evaluation Results
| Method | Date | PPL | W. F1 | BLEU-1 | BLEU-2 | DIST-1 | DIST-2 | K. F1 | Failure Rate |
|---|---|---|---|---|---|---|---|---|---|
| T5-Flan | 2026.05 | 6.46 | 37.46 | 33.9 | 23.5 | 1.5 | 5.3 | 41.21 | 22.12 |
| Our (mode=soft) | 2026.05 | 7.13 | 39.84 | 36.5 | 25.3 | 2.1 | 7.9 | 49.27 | 23.68 |
| Our (mode=hard) | 2026.05 | 7.27 | 39.61 | 36.3 | 25.2 | 2.2 | 8.2 | 49.44 | 23.68 |
| TPDial (repro=⋄) | 2026.05 | 7.67 | 31.71 | 27.3 | 17.8 | 2.3 | 7.8 | 16.02 | 99.69 |
| TRIPDial (repro=⋄) | 2026.05 | 7.85 | 30.18 | 27.2 | 18.2 | 2.2 | 7 | 30.17 | 73.21 |