Argument Component Detection on Persuasive Essays (PE) (test)
[Chart: Macro F1 by date. Highlighted point: Human Upper Bound, 88.6 Macro F1, Mar 3, 2026. Selectable metrics: Macro F1, Accuracy.]
Updated 3mo ago
Evaluation Results
| Method | Details | Date | Macro F1 | Accuracy |
|---|---|---|---|---|
| Human Upper Bound | Dataset=PE | 2026.03 | 88.6 | - |
| Llama-3-8B | Dataset=PE, Learning P... | 2026.03 | 87.78 | 90.04 |
| CRF with features | Dataset=PE | 2026.03 | 86.7 | - |
| GPT-2-1.5B | Dataset=PE, Learning P... | 2026.03 | 85.21 | 88.04 |
| OPT-6.7B | Dataset=PE, Learning P... | 2026.03 | 85.18 | 88.56 |
| MT-all | Dataset=PE | 2026.03 | 75.66 | - |
| DeBERTa-v3 | Dataset=PE, Learning P... | 2026.03 | 71.12 | - |
| RoBERTa | Dataset=PE, Learning P... | 2026.03 | 69.33 | - |
| Heuristic Baseline | Dataset=PE | 2026.03 | 64.2 | - |