Flagging high human disagreement cases on Measuring Hate Speech corpus
[Chart: Precision (High Disagreement). Leading method: Category-based escalation, 40.1 (Apr 4, 2026). Other metrics shown: Recall (High Disagreement), F1 Score (High Disagreement).]
Updated 12d ago
Evaluation Results
Method                      Date      Precision (High Disagreement)   Recall (High Disagreement)   F1 Score (High Disagreement)
Category-based escalation   2026.04   40.1                            84.5                         54.8
Divergence Only             2026.04   34.7                            91.5                         50.3
Random Baseline             2026.04   33.3                            50.5                         40.1
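As a reading aid for the three metric columns, the sketch below recomputes F1 as the harmonic mean of the listed precision and recall. The function name `f1_score` and the `reported` dictionary are illustrative, not part of the benchmark. The page does not say how its F1 values were computed, so the standard harmonic-mean definition is an assumption here, and a recomputed value may differ slightly from the reported one (for example, if the reported score came from unrounded inputs or was averaged over splits).

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (output is on the same scale as the inputs)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) in percent, copied from the results table.
# Hypothetical container used only for this illustration.
reported = {
    "Category-based escalation": (40.1, 84.5, 54.8),
    "Divergence Only": (34.7, 91.5, 50.3),
    "Random Baseline": (33.3, 50.5, 40.1),
}

for method, (p, r, f1_reported) in reported.items():
    print(f"{method}: recomputed F1 = {f1_score(p, r):.1f}, reported = {f1_reported}")
```

With the rounded table values, Divergence Only and Random Baseline recompute to their reported F1 (50.3 and 40.1), while Category-based escalation recomputes to about 54.4 against a reported 54.8.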