Mislabel detection on Weak Reference Labels (Mislabels + subjective)

Top result: 76.2 AP, 4PL Δℓi (proposed)

Updated 1mo ago

Evaluation Results

Method	Date	Links	AP		
4PL Δℓi (proposed)	2026.05		76.2	92.7	81
top-10 disagreement	2026.05		73.8	90.8	78.6
XGBoost (4PL params)	2026.05		70.3	88.6	74
plain 2PL (low ai)	2026.05		70	78.7	70.9
4PL, low di	2026.05		69.8	88.3	73.8
plain 4PL, single-stage (low di)	2026.05		69.8	91.5	74
GLAD	2026.05		66.1	85.1	67.7
low ri	2026.05		62.6	87.6	67.2
overall disagreement	2026.05		57.3	81.7	66.2