Pedagogical Dialogue Classification on MRBench (test)

Mistake ID Acc: 91

S5: HPO-FT

Updated 5mo ago

Evaluation Results

Method	Date	Links	Mistake ID Acc	(unlabeled)	(unlabeled)	(unlabeled)	(unlabeled)
S5: HPO-FT	2025.12		91	86	89	83	84.5
S4: HPO-Base	2025.12		90	84	87	81	82.5
GPT-4o	2025.12		88	82	85	80	81.2
S3: Unstructured	2025.12		88	82	85	78	80
S2: Cooperative	2025.12		86	80	83	77	78.5
Llama-70B	2025.12		85	78	81	74	76
S1: Single	2025.12		79	71	76	68	69.5