InfoDisent: Explainability of Image Classification Models by Information Disentanglement

About

In this work, we introduce InfoDisent, a hybrid approach to explainability based on the information bottleneck principle. InfoDisent enables the disentanglement of information in the final layer of any pretrained model into atomic concepts, which can be interpreted as prototypical parts. This approach merges the flexibility of post-hoc methods with the concept-level modeling capabilities of self-explainable neural networks, such as ProtoPNets. We demonstrate the effectiveness of InfoDisent through computational experiments and user studies across various datasets using modern backbones such as ViTs and convolutional networks. Notably, InfoDisent generalizes the prototypical parts approach to novel domains (ImageNet).

Łukasz Struski, Dawid Rymarczyk, Jacek Tabor • 2024
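To make the idea of disentangling a final-layer feature map into localized concepts more concrete, here is a minimal sketch under stated assumptions. It is not the authors' implementation: the abstract does not specify the mechanism, so the orthogonal channel rotation, the per-channel max pooling, the random stand-in feature map, and the helper names `random_orthogonal` and `disentangle` are all illustrative assumptions. In practice the rotation would presumably be learned rather than drawn at random.

```python
import numpy as np


def random_orthogonal(channels, seed=0):
    """Hypothetical helper: an orthogonal matrix from a QR decomposition.

    Stands in for a learned channel-mixing transform (an assumption).
    """
    rng = np.random.default_rng(seed)
    q, _ = np.linalg.qr(rng.standard_normal((channels, channels)))
    return q


def disentangle(feature_map, rotation):
    """Hypothetical helper: rotate channels, then keep one location per channel.

    feature_map: array of shape (H, W, C), e.g. a backbone's last feature map.
    rotation:    array of shape (C, C), assumed orthogonal.
    Returns the per-channel peak activation and its (row, col) location.
    """
    h, w, c = feature_map.shape
    pixels = feature_map.reshape(h * w, c)
    rotated = pixels @ rotation                      # mix channels per pixel
    flat_idx = rotated.argmax(axis=0)                # strongest pixel per channel
    activations = rotated[flat_idx, np.arange(c)]    # shape (C,)
    locations = np.stack(np.unravel_index(flat_idx, (h, w)), axis=1)  # (C, 2)
    return activations, locations


# Stand-in for a frozen backbone's output on one image (random data, assumed shape).
feature_map = np.random.default_rng(1).standard_normal((7, 7, 16))
rotation = random_orthogonal(16)
activations, locations = disentangle(feature_map, rotation)
```

Each entry of `locations` points at the single image region that drives one channel, which is the sense in which a channel could be read as a prototypical part in this sketch.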

Related benchmarks

Task	Dataset	Result
Image Classification	ImageNet 1k (test)	Top-1 Accuracy: 82.8	490
Fine-grained Image Classification	Stanford Cars	Accuracy: 92.9	298
Fine-grained Image Classification	CUB-200	Accuracy (All): 84.1	53
Fine-grained Image Classification	Stanford Dogs	--	18
Explanation Matching	CUB-200 User Study 2011 (test)	Accuracy: 65	7
Explanation Matching	ImageNet User Study (test)	Accuracy: 59	3
Prototype Discriminability	ImageNet	Accuracy: 59	3

