
ZK-Value: A Practical Zero-Knowledge System for Verifiable Data Valuation

About

Data valuation is a foundational task in data marketplaces, where a Shapley-value attribution determines how a buyer's payment is distributed among data providers. Typically, the marketplace operator runs this attribution alone, requiring participants and external auditors to trust scores they cannot independently recompute on the underlying private data. While zero-knowledge proofs (ZKPs) can theoretically reconcile this conflict between privacy and verifiability, existing ZK valuation systems fail to scale to real-world marketplace demands due to prohibitive proving times or the requirement to disclose validation cohorts. We present ZK-Value, a practical, end-to-end ZK data-valuation system. Our solution bridges the scalability gap through a fully co-designed architecture: (1) LSH-Shapley, a locality-based valuation primitive that replaces expensive pairwise distance metrics with per-bucket collision counts; (2) ZK-LSH-Shapley, a tailored ZKP protocol that drastically reduces witness size by encoding these counts into bucket-level histograms rather than naive per-pair tensors; and (3) structural proof-system optimizations, specifically super-oracle batching and sparsity skipping. Evaluated across 12 standard datasets, ZK-Value delivers valuation quality on par with state-of-the-art baselines (within 0.033 AUROC of exact KNN-Shapley), while generating proofs in seconds to minutes and outperforming specialized ZK baselines by 12.6x to 68.1x in proving time, with verification in under 4.6 s.
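To make the bucket-histogram idea concrete, here is a minimal sketch that uses only the Python standard library. It is not the paper's LSH-Shapley formula, which the abstract does not give. It assumes random-hyperplane (sign) hashing and a toy score, agreeing minus disagreeing validation labels in the same bucket, purely to show how per-bucket counts can stand in for per-pair distance computations. All function names are hypothetical.

```python
import random
from collections import Counter


def make_hyperplanes(dim, n_bits, seed=0):
    """Draw n_bits random Gaussian hyperplanes (assumed hash family)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]


def bucket_of(x, planes):
    """Bucket key = tuple of sign bits of the projections of x."""
    return tuple(
        1 if sum(p_i * x_i for p_i, x_i in zip(p, x)) >= 0 else 0
        for p in planes
    )


def bucket_histogram(points, labels, planes):
    """Count points per (bucket, label) cell: size grows with occupied
    buckets, not with the number of train/validation pairs."""
    hist = Counter()
    for x, y in zip(points, labels):
        hist[(bucket_of(x, planes), y)] += 1
    return hist


def collision_scores(train_x, train_y, val_x, val_y, planes):
    """Toy proxy score (an assumption, not a Shapley value): validation
    points colliding with the same label minus those with another label."""
    val_hist = bucket_histogram(val_x, val_y, planes)
    bucket_totals = Counter()
    for (bucket, _label), count in val_hist.items():
        bucket_totals[bucket] += count
    scores = []
    for x, y in zip(train_x, train_y):
        b = bucket_of(x, planes)
        agree = val_hist[(b, y)]          # Counter returns 0 for unseen cells
        scores.append(agree - (bucket_totals[b] - agree))
    return scores


planes = make_hyperplanes(dim=2, n_bits=4)
train_x, train_y = [[1.0, 0.0], [-1.0, 0.0]], [0, 1]
val_x, val_y = [[1.0, 0.0], [1.0, 0.0], [-1.0, 0.0]], [0, 0, 0]
print(collision_scores(train_x, train_y, val_x, val_y, planes))  # [2, -1]
```

In this toy run the second training point, whose label disagrees with the validation point in its bucket, receives a negative score. The histogram is the only structure consulted at scoring time, which is the kind of compact witness the abstract contrasts with per-pair tensors.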

Zhaoyu Wang, Pingchuan Ma, Zhantong Xue, Yuguang Zhou, Qixin Zhang, Xiaoqin Zhang, Shuai Wang ((1) HKUST, Hong Kong SAR; (2) Zhejiang University of Technology, Hangzhou, China; (3) Nanyang Technological University, Singapore) • 2026

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| Data Valuation | MNIST | -- | 48 |
| Misclassification Detection | CIFAR-10 | AUROC 99.8 | 31 |
| Anomaly Detection | Fraud | -- | 31 |
| Noisy label detection | phoneme | AUC 0.883 | 18 |
| Noisy label detection | Click | AUC 0.723 | 18 |
| Noisy Detection | Wind | AUROC 81.8 | 17 |
| Mislabel Detection | 2Dplanes | AUROC 0.938 | 17 |
| Mislabel Detection | Wind | AUROC 88.9 | 17 |
| Mislabel Detection | CPU | AUROC 95.2 | 17 |
| Mislabel Detection | Fraud | AUROC 94.1 | 17 |
Showing 10 of 46 rows
