Discrimination on DBE-KT (test)
[Chart: AA and CA scores on DBE-KT (test); top result: QG-SMS, AA 66.66, Mar 7, 2025]
Updated 4d ago
Evaluation Results
| Method | Details | Links | AA | CA |
|---|---|---|---|---|
| QG-SMS | Scoring Protocol=Pairw... | 2025.03 | 66.66 | 56.99 |
| ChatEval | Scoring Protocol=Pairw... | 2025.03 | 65.05 | 53.76 |
| Vanilla | Scoring Protocol=Pairw... | 2025.03 | 63.98 | 49.46 |
| CoT | Scoring Protocol=Pairw... | 2025.03 | 62.9 | 34.41 |
| Swap | Scoring Protocol=Pairw... | 2025.03 | 62.9 | 48.39 |
| Metrics | Scoring Protocol=Pairw... | 2025.03 | 61.29 | 45.16 |
| Reference | Scoring Protocol=Pairw... | 2025.03 | 60.75 | 44.09 |
| KDA_large | Scoring Protocol=Indiv... | 2025.03 | 58.06 | - |
| QSalience | Scoring Protocol=Indiv... | 2025.03 | 47.31 | - |
| BERTScore | Scoring Protocol=Indiv... | 2025.03 | 30.11 | - |