Difficulty on EduAgent (test)
[Chart: Accuracy (AA) and Classification Accuracy (CA) over time. Top Accuracy (AA) result: ChatEval, 68.95 (Mar 7, 2025).]
Evaluation Results
| Method | Scoring Protocol | Date | Accuracy (AA) | Classification Accuracy (CA) |
|---|---|---|---|---|
| ChatEval | Pairw... | 2025.03 | 68.95 | 51.61 |
| QG-SMS | Pairw... | 2025.03 | 68.55 | 65.32 |
| Reference | Pairw... | 2025.03 | 66.53 | 51.61 |
| Swap | Pairw... | 2025.03 | 66.53 | 54.84 |
| Metrics | Pairw... | 2025.03 | 65.32 | 53.22 |
| Vanilla | Pairw... | 2025.03 | 63.71 | 50.8 |
| CoT | Pairw... | 2025.03 | 61.69 | 32.26 |
| KDA_large | Indiv... | 2025.03 | 60.48 | - |
| QSalience | Indiv... | 2025.03 | 54.03 | - |
| BERTScore | Indiv... | 2025.03 | 51.61 | - |