
AITA

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Annotator Modeling | AITA situation | Accuracy: 68.1 | 19 |
| Verdict Prediction | AITA 1.0 (author split) | Accuracy: 85.6 | 14 |
| Judgment Prediction | AITA verdict | Accuracy: 85.9 | 14 |
| Fine-grained Moral Steering | AITA | Rho: 0.966 | 8 |
| Moral Steering | AITA (test) | Deviation (alpha_U=100%): -15.92 | 8 |
| Sycophancy Evaluation | AITA | Sycophancy Score (S) PD-L: 0.54 | 6 |
| Causal Effect Estimation | AITA comments | ΔATE: 3.43 | 6 |
| Causal Estimation | AITA anger | ΔATE: 154.61 | 6 |