
Elec2Deb

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Logical Fallacy Tutoring | Elec2Deb20 (normal students) | Divergence: 69.3 | 3 |
| Dialogue Evaluation | Elec2Deb20 Normal students | Divergence: 83 | 2 |
| Dialogue Evaluation | Elec2Deb20 Normal Students 1.0 (test) | Divergence: 86 | 2 |
| Logical Fallacy Tutoring | Elec2Deb20 adversarial student | Divergence Rate: 11.9 | 2 |
| Logical Fallacy Tutoring | Elec2Deb20 Human Evaluation (pilot study) | Divergence: 1.65 | 2 |