
Student

Benchmarks

| Task Name | Dataset Name | Metric | SOTA Result | Trend |
| --- | --- | --- | --- | --- |
| Solution Simulation | Student_15 1.0 (test) | ROUGE-L | 27.37 | 72 |
| Solution Simulation | Student 100 1.0 (test) | ROUGE-L | 0.3034 | 72 |
| Faithful Narrative Generation | student | RA | 97.5 | 16 |
| Selective Classification | Student Port | AU-ARC | 92.76 | 15 |
| Solution Simulation | Student_15 v1.0 (test) | Con | 23.65 | 15 |
| RF compression | student | Performance | 85.7 | 14 |
| Model Compression | student | Performance | 85.7 | 13 |
| Model Compression | student | Accuracy / R2 Score | 86 | 13 |
| Classification | Student | AUC | 98.4 | 8 |
| Classification | Student three random (train-test) | Accuracy | 94.3 | 8 |
| Negotiation | Student (held-out test) | Success Rate | 100 | 7 |
| Explanation Generation | Student | PPL | 4.56 | 7 |
| Multi-label classification | student | Accuracy | 19.62 | 7 |
| Negotiation | Student | Success Rate | 100 | 6 |
| Tutor Robustness Evaluation | Student w/ Reasoning | Student Leakage | 16 | 3 |
| Counterfactual Dataset Generation | Student (test) | GT Found Count | 24 | 3 |
| Decision Tree Rashomon Set construction | Student | Runtime (Time) | 2.43 | 2 |
| Decision Tree Rashomon Set Calculation | Student | Runtime | 2.43 | 2 |
| Student Misconception Analysis | Student_1361 OS_124 | Accuracy | 12.4 | 2 |
| Student Misconception Analysis | Student 1361 Java 200 (subset) | Accuracy | 17.5 | 2 |
| Student Misconception Analysis | Student_1361 Discrete_106 | Accuracy | 26.5 | 2 |
| Classification | Student random splits (test) | Relative Error Rate Reduction | 0.9 | 2 |
| Decision Tree Rashomon Set Enumeration | Student-48 | PRAXIS Count | 56,920,256 | 1 |
| Fairness-aware classification | student_por (test) | - | - | 0 |