Logical Fallacy Tutoring on Elec2Deb20 (normal students)
[Chart: Divergence score by method. Data point shown: BASE, 69.3 (May 31, 2026).]
Evaluation Results
| Method | Method details | Links | Divergence | Stance Change | Repetition | Lack of Refutation | Lack of Evidence Inquiry | Strategy Fixation | Unexplained LF Terms | Passive Guidance | Average Performance |
|---|---|---|---|---|---|---|---|---|---|---|---|
| BASE | Backbone=GPT-4o, Evalu... | 2026.05 | 69.3 | 8.7 | 13 | 54.9 | 11.1 | 43.4 | 49.2 | 4.4 | 31.2 |
| BASE W/ PROBLEMS | Backbone=GPT-4o, Evalu... | 2026.05 | 74 | 48.1 | 42.5 | 99.9 | 95.7 | 63 | 27.6 | 41.5 | 61.5 |
| LFTutor | Backbone=GPT-4o, Evalu... | 2026.05 | 84.6 | 87.9 | 78.3 | 99.6 | 96.1 | 91.2 | 95 | 43.6 | 84.5 |