Share your thoughts, 1 month free Claude Pro on usSee more
WorkDL logo mark

Efficient Banzhaf-Based Data Valuation for $k$-Nearest Neighbors Classification

About

Data valuation, the task of quantifying the contribution of individual data points to model performance, has emerged as a fundamental challenge in machine learning. Game-theoretic approaches, such as the Banzhaf value, offer principled frameworks for fair data valuation; however, they suffer from exponential computational complexity. We address this challenge by developing efficient algorithms specifically tailored for computing Banzhaf values in $k$-nearest neighbor ($k$NN) classifiers. We first establish the theoretical hardness of the problem by proving that it is \#P-hard. Despite this intractability, we exploit the locality properties of $k$NN classifiers to develop practical exact algorithms. Our main contribution is a dynamic programming framework that achieves significant computational improvements: we present a pseudo-polynomial algorithm with $O(Wkn^2)$ time complexity for weighted $k$NN classifiers, where $W$ is the maximum sum of top-$k$ weights, and a specialized algorithm for unweighted $k$NN that achieves $O(nk^2)$ time complexity, that is, linear in the number of data points. We also offer efficient Monte Carlo estimation methods. Extensive experiments on real-world datasets demonstrate the practical efficiency of our approach and its effectiveness in data valuation applications.

Guangyi Zhang, Lutz Oettershagen, Lixu Wang, Aristides Gionis• 2026

Related benchmarks

TaskDatasetResultRank
Mislabel Detectionapsfail
AUROC0.22
17
Mislabel DetectionFraud
AUROC11
17
Mislabel DetectionPOL
AUROC0.06
17
Mislabel DetectionWind
AUROC12
17
Mislabel Detection2Dplanes
AUROC0.21
17
Mislabel Detectionvehicle
AUROC0.08
17
Mislabel DetectionCPU
AUROC5
17
Misclassification DetectionCIFAR10
AUROC9
14
Identifying mislabeled pointsClick
Recall14
12
Identifying mislabeled pointsClick
Precision7
12
Showing 10 of 46 rows

Other info

Follow for update