Tutor Robustness Evaluation on Base Student Adv. Agent
[Chart: Student Leakage by date. Selectable metrics: Student Leakage, Student Dialogue Turns, Tutor Leakage, Tutor Dialogue Turns. Labeled point: MathDial-SFT, Student Leakage 18, Apr 20, 2026.]
Updated 1mo ago
Evaluation Results
Method        Setting                    Date     Student Leakage  Student Dialogue Turns  Tutor Leakage  Tutor Dialogue Turns
MathDial-SFT  Tutor Setting=Base in-...  2026.04  18               4.77                    40             5.85
SocraticLM    Tutor Setting=Base in-...  2026.04  37               5.33                    34             2.49
TutorRL-7B    Tutor Setting=Base in-...  2026.04  66               5.95                    17             8.74