| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Solution Simulation | Student_15 1.0 (test) | ROUGE-L 27.37 | 72 |
| Solution Simulation | Student 100 1.0 (test) | ROUGE-L 0.3034 | 72 |
| Faithful Narrative Generation | student | RA 97.5 | 16 |
| Selective Classification | Student Port | AU-ARC 92.76 | 15 |
| Solution Simulation | Student_15 v1.0 (test) | Con 23.65 | 15 |
| RF compression | student | Performance 85.7 | 14 |
| Model Compression | student | Performance 85.7 | 13 |
| Model Compression | student | Accuracy / R2 Score 86 | 13 |
| Classification | Student | AUC 98.4 | 8 |
| Classification | Student three random (train-test) | Accuracy 94.3 | 8 |
| Negotiation | Student (held-out test) | Success Rate 100 | 7 |
| Explanation Generation | Student | PPL 4.56 | 7 |
| Multi-label classification | student | Accuracy 19.62 | 7 |
| Negotiation | Student | Success Rate 100 | 6 |
| Tutor Robustness Evaluation | Student w/ Reasoning | Student Leakage 16 | 3 |
| Counterfactual Dataset Generation | Student (test) | GT Found Count 24 | 3 |
| Decision Tree Rashomon Set construction | Student | Runtime (Time) 2.43 | 2 |
| Decision Tree Rashomon Set Calculation | Student | Runtime 2.43 | 2 |
| Student Misconception Analysis | Student_1361 OS_124 | Accuracy 12.4 | 2 |
| Student Misconception Analysis | Student 1361 Java 200 (subset) | Accuracy 17.5 | 2 |
| Student Misconception Analysis | Student_1361 Discrete_106 | Accuracy 26.5 | 2 |
| Classification | Student random splits (test) | Relative Error Rate Reduction 0.9 | 2 |
| Decision Tree Rashomon Set Enumeration | Student-48 | PRAXIS Count 56,920,256 | 1 |
| Fairness-aware classification | student_por (test) | Metric - | 0 |