Social Science Measurement on Argument Quality
[Figure: Expected Calibration Error (ECE) over time. Plotted result: BERT, 0.374 (May 12, 2026).]
Updated 21d ago
Evaluation Results

Method      Settings                 Date     Expected Calibration Error (ECE)  Brier Score
BERT        distillation=soft label  2026.05  0.374                             0.208
GPT-5-nano  Verbal=true              2026.05  0.53                              0.356
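
For readers unfamiliar with the two metrics reported above, the following is a minimal sketch of how they are commonly computed. The page does not say how this benchmark bins predictions or encodes labels, so the binary 0/1 labels, the 10 equal-width bins, and the function names `brier_score` and `expected_calibration_error` are all assumptions made for illustration, not the benchmark's own evaluation code. Lower is better for both metrics.

```python
def brier_score(probs, labels):
    """Mean squared difference between predicted probability and the 0/1 outcome."""
    if not probs or len(probs) != len(labels):
        raise ValueError("probs and labels must be non-empty and the same length")
    return sum((p - y) ** 2 for p, y in zip(probs, labels)) / len(probs)


def expected_calibration_error(probs, labels, n_bins=10):
    """Binned ECE for binary labels (assumed setup: equal-width bins on [0, 1]).

    Each prediction is placed in a bin by its predicted probability. Within a
    bin, the gap between the mean predicted probability and the observed
    positive rate is taken, and gaps are averaged with bin-size weights.
    """
    if not probs or len(probs) != len(labels):
        raise ValueError("probs and labels must be non-empty and the same length")
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        index = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        bins[index].append((p, y))

    total = len(probs)
    ece = 0.0
    for members in bins:
        if not members:
            continue  # empty bins contribute nothing
        mean_prob = sum(p for p, _ in members) / len(members)
        positive_rate = sum(y for _, y in members) / len(members)
        ece += (len(members) / total) * abs(positive_rate - mean_prob)
    return ece


if __name__ == "__main__":
    # Toy data, not taken from the benchmark.
    probs = [0.9, 0.8, 0.3, 0.1]
    labels = [1, 1, 0, 0]
    print("Brier score:", round(brier_score(probs, labels), 4))
    print("ECE:", round(expected_calibration_error(probs, labels), 4))
```

On the toy data the Brier score is about 0.0375 and the ECE is about 0.175. ECE depends on the binning scheme, so values computed this way are only comparable to the table if the benchmark uses the same scheme.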