
ProtoTEx: Explaining Model Decisions with Prototype Tensors

About

We present ProtoTEx, a novel white-box NLP classification architecture based on prototype networks. ProtoTEx faithfully explains model decisions based on prototype tensors that encode latent clusters of training examples. At inference time, classification decisions are based on the distances between the input text and the prototype tensors, explained via the training examples most similar to the most influential prototypes. We also describe a novel interleaved training algorithm that effectively handles classes characterized by the absence of indicative features. On a propaganda detection task, ProtoTEx accuracy matches BART-large and exceeds BERT-large with the added benefit of providing faithful explanations. A user study also shows that prototype-based explanations help non-experts to better recognize propaganda in online news.
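The inference step described above (score classes by distance to prototypes, then explain the decision through the training examples nearest to the most influential prototype) can be illustrated with a small sketch. This is not the authors' implementation. It assumes the input and the training examples are already encoded as fixed-size vectors, treats each prototype as a single vector rather than a tensor, and converts distances to similarities with `exp(-d)`. The function name `classify_with_prototypes` and all toy values are hypothetical.

```python
import numpy as np

def classify_with_prototypes(x, prototypes, weights, train_encodings, k=2):
    """Toy prototype classifier (illustrative only, not the ProtoTEx code)."""
    # Euclidean distance from the encoded input to each prototype, shape (p,)
    dists = np.linalg.norm(prototypes - x, axis=1)
    # Assumption: map distances to positive similarities so closer means larger
    sims = np.exp(-dists)
    # A linear layer over prototype similarities gives class scores, shape (c,)
    logits = sims @ weights
    pred = int(np.argmax(logits))
    # Most influential prototype: largest contribution to the predicted class score
    contributions = sims * weights[:, pred]
    top_proto = int(np.argmax(contributions))
    # Explanation: indices of the k training examples closest to that prototype
    train_dists = np.linalg.norm(train_encodings - prototypes[top_proto], axis=1)
    neighbors = np.argsort(train_dists)[:k].tolist()
    return pred, top_proto, neighbors

# Hypothetical toy data: two prototypes, two classes, four training encodings
prototypes = np.array([[0.0, 0.0], [10.0, 10.0]])
weights = np.array([[1.0, 0.0], [0.0, 1.0]])
train_encodings = np.array([[0.0, 1.0], [9.0, 10.0], [0.5, 0.0], [10.0, 11.0]])
x = np.array([1.0, 0.0])

pred, top_proto, neighbors = classify_with_prototypes(
    x, prototypes, weights, train_encodings
)
print(pred, top_proto, neighbors)
```

In this toy setup the returned neighbor indices play the role of the explanation: they point at the training examples that would be shown to a user to justify the predicted class.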

Anubrata Das, Chitrank Gupta, Venelin Kovatchev, Matthew Lease, Junyi Jessy Li • 2022

Related benchmarks

Task                                  Dataset         Result              Rank
Text Classification                   AG-News         Accuracy 91.5       248
Text Classification                   IMDB            Accuracy 93.5       107
Text Classification                   Beer            Accuracy 87.7       7
Text Classification                   Hotel           Accuracy 97.7       7
Text Classification                   CEBaB           Acc 61              7
Human forward simulatability          Beer (test)     Accuracy 95         5
Text Classification                   Twitter         Accuracy 82.6       5
Human forward simulatability          AGNews (test)   Accuracy 78.3       5
Text Classification                   SciCite         Accuracy 85.2       5
Concept Comprehensibility Evaluation  SciCite         Semantics Score 45  4
Showing 10 of 23 rows
