Dialogue Evaluation on ConsistentChat (test)
[Chart: G-E Score over time (Apr 9, 2026). Highlighted entry: MDS, G-E Score 7.3. Selectable metrics: G-E Score, Ent-F1.]
Updated 9d ago
Evaluation Results
| Method | Setting (truncated in source) | Date | G-E Score | Ent-F1 |
|---|---|---|---|---|
| MDS | selection_source=Banki... | 2026.04 | 7.3 | 30 |
| Random Data | selection_source=Banki... | 2026.04 | 7.22 | 29 |
| Heuristic | selection_source=Banki... | 2026.04 | 7.22 | 27.8 |
| DialScore | selection_source=Banki... | 2026.04 | 7.2 | 29.1 |
| Rethinking | selection_source=Banki... | 2026.04 | 7.18 | 28.2 |
| ZIP | selection_source=Banki... | 2026.04 | 7.16 | 28.5 |
| CC-Score | selection_source=Banki... | 2026.04 | 7.16 | 28.5 |
| All Data | selection_source=Banki... | 2026.04 | 7.12 | 28.8 |
| SuperFiltering | selection_source=Banki... | 2026.04 | 7.12 | 28.3 |
| Backbone | selection=none (base m... | 2026.04 | 6.7 | 22.2 |