Tutor Robustness Evaluation on Manually Defined Prompts
[Chart: Student Leakage for MathDial-SFT, dated Apr 20, 2026. Selectable metrics: Student Leakage, Student Turn Count, Tutor Leakage, Tutor Turn Count.]
Updated 1mo ago
Evaluation Results
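| Method       | Setting                   | Date    | Student Leakage | Student Turn Count | Tutor Leakage | Tutor Turn Count |
|--------------|---------------------------|---------|-----------------|--------------------|---------------|------------------|
| MathDial-SFT | Tutor Setting=Base in-... | 2026.04 | 0               | -                  | 67            | 5.94             |
| SocraticLM   | Tutor Setting=Base in-... | 2026.04 | 0               | -                  | 64            | 2.31             |
| TutorRL-7B   | Tutor Setting=Base in-... | 2026.04 | 0               | -                  | 75            | 7.84             |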