Tutor Robustness Evaluation on Finetuned Adv. Agent
[Chart: Student Information Leakage by method. Plotted point: SocraticLM, 1 (Apr 20, 2026). Available metrics: Student Information Leakage, Student Dialogue Turns, Tutor Information Leakage, Tutor Dialogue Turns.]
Updated 1mo ago
Evaluation Results

| Method | Setting | Date | Student Information Leakage | Student Dialogue Turns | Tutor Information Leakage | Tutor Dialogue Turns |
|---|---|---|---|---|---|---|
| SocraticLM | Tutor Setting=Base in-... | 2026.04 | 1 | 13.52 | 70 | 2.61 |
| MathDial-SFT | Tutor Setting=Base in-... | 2026.04 | 2 | 8.21 | 70 | 6.1 |
| TutorRL-7B | Tutor Setting=Base in-... | 2026.04 | 2 | 10.94 | 83 | 8.65 |