Discrimination on EduAgent (test)
[Chart: best result is QG-SMS with 66.39 Accuracy (Mar 7, 2025). Available metrics: Accuracy, Classification Agreement.]
Evaluation Results
| Method | Scoring Protocol | Date | Accuracy | Classification Agreement |
|---|---|---|---|---|
| QG-SMS | Pairw... | 2025.03 | 66.39 | 55.74 |
| BERTScore | Indiv... | 2025.03 | 65.57 | - |
| Metrics | Pairw... | 2025.03 | 65.57 | 50.82 |
| Swap | Pairw... | 2025.03 | 64.75 | 45.9 |
| Vanilla | Pairw... | 2025.03 | 63.11 | 49.18 |
| Reference | Pairw... | 2025.03 | 62.3 | 45.9 |
| KDA_large | Indiv... | 2025.03 | 60.66 | - |
| CoT | Pairw... | 2025.03 | 59.84 | 32.79 |
| ChatEval | Pairw... | 2025.03 | 54.92 | 42.56 |
| QSalience | Indiv... | 2025.03 | 52.46 | - |