Algorithms to estimate Shapley value feature attributions

About

Feature attributions based on the Shapley value are popular for explaining machine learning models; however, their estimation is complex from both a theoretical and computational standpoint. We disentangle this complexity into two factors: (1) the approach to removing feature information, and (2) the tractable estimation strategy. These two factors provide a natural lens through which we can better understand and compare 24 distinct algorithms. Based on the various feature removal approaches, we describe the multiple types of Shapley value feature attributions and methods to calculate each one. Then, based on the tractable estimation strategies, we characterize two distinct families of approaches: model-agnostic and model-specific approximations. For the model-agnostic approximations, we benchmark a wide class of estimation approaches and tie them to alternative yet equivalent characterizations of the Shapley value. For the model-specific approximations, we clarify the assumptions crucial to each method's tractability for linear, tree, and deep models. Finally, we identify gaps in the literature and promising future research directions.

Hugh Chen, Ian C. Covert, Scott M. Lundberg, Su-In Lee • 2022
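
The two factors named in the abstract can be illustrated with a small sketch. It is not the authors' code: the function names (`value`, `exact_shapley`, `permutation_shapley`, `toy_model`) are hypothetical, feature removal is assumed to mean replacing a feature with a fixed baseline value (only one of several possible removal approaches), and the estimation strategy shown is plain permutation sampling, one example of a model-agnostic approximation.

```python
import itertools
import math
import random


def value(model, x, baseline, subset):
    """Feature removal (assumed here): features outside `subset` take the baseline value."""
    masked = [x[i] if i in subset else baseline[i] for i in range(len(x))]
    return model(masked)


def exact_shapley(model, x, baseline):
    """Exact Shapley values by enumerating every subset; cost grows as 2^d."""
    d = len(x)
    phi = [0.0] * d
    for i in range(d):
        others = [j for j in range(d) if j != i]
        for size in range(d):
            # Classic Shapley weight |S|! (d - |S| - 1)! / d!
            weight = (math.factorial(size) * math.factorial(d - size - 1)
                      / math.factorial(d))
            for subset in itertools.combinations(others, size):
                s = set(subset)
                gain = (value(model, x, baseline, s | {i})
                        - value(model, x, baseline, s))
                phi[i] += weight * gain
    return phi


def permutation_shapley(model, x, baseline, num_permutations=2000, seed=0):
    """Monte Carlo estimate: average marginal gains over random feature orderings."""
    rng = random.Random(seed)
    d = len(x)
    phi = [0.0] * d
    for _ in range(num_permutations):
        order = list(range(d))
        rng.shuffle(order)
        present = set()
        prev = value(model, x, baseline, present)
        for i in order:
            present.add(i)
            cur = value(model, x, baseline, present)
            phi[i] += cur - prev  # marginal contribution of i in this ordering
            prev = cur
    return [p / num_permutations for p in phi]


def toy_model(z):
    """Hypothetical model: one additive term plus one pairwise interaction."""
    return 2.0 * z[0] + z[1] * z[2]


if __name__ == "__main__":
    x, baseline = [1.0, 2.0, 3.0], [0.0, 0.0, 0.0]
    print("exact:    ", exact_shapley(toy_model, x, baseline))
    print("estimated:", permutation_shapley(toy_model, x, baseline))
```

For this toy model the exact attributions are 2, 3, and 3: the additive term goes entirely to the first feature, and the interaction term is split equally between the other two. The sampled estimates always sum to the difference between the model output at `x` and at the baseline, because the marginal gains within each ordering telescope.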

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Faithfulness Evaluation | AG News (test) | Rate of Label Changes: 15 | 24 |
| Faithfulness Evaluation | IMDB (test) | Rate of Label Changes: 30 | 24 |
| Faithfulness Evaluation | SST-2 (test) | Rate of Label Changes: 32 | 24 |
| Keyword Prediction | SST-2 | Precision: 54.3 | 8 |
| Explanation Faithfulness | Jailbreaking GCG (test) | Rate of Label Changes: 11 | 8 |
| Explanation Faithfulness | AutoDAN (test) | Label Change Rate: 15 | 8 |
| Explanation Faithfulness | Jailbreaking DAN (test) | Label Change Rate: 33 | 8 |
| Keyword Prediction | IMDB | Precision: 29.5 | 8 |
| Keyword Prediction | AG-News | Precision: 52.8 | 8 |
| Keyword Prediction | GCG jailbreaking prompts (test) | Precision: 65.1 | 4 |
Showing 10 of 12 rows
