
Concepts' Information Bottleneck Models

About

Concept Bottleneck Models (CBMs) aim to deliver interpretable predictions by routing decisions through a human-understandable concept layer, yet they often suffer reduced accuracy and concept leakage that undermines faithfulness. We introduce an explicit Information Bottleneck regularizer on the concept layer that penalizes $I(X;C)$ while preserving task-relevant information in $I(C;Y)$, encouraging minimal-sufficient concept representations. We derive two practical variants (a variational objective and an entropy-based surrogate) and integrate them into standard CBM training without architectural changes or additional supervision. Evaluated across six CBM families and three benchmarks, the IB-regularized models consistently outperform their vanilla counterparts. Information-plane analyses further corroborate the intended behavior. These results indicate that enforcing a minimal-sufficient concept bottleneck improves both predictive performance and the reliability of concept-level interventions. The proposed regularizer offers a theoretically grounded, architecture-agnostic path to more faithful and intervenable CBMs, resolving prior evaluation inconsistencies by aligning training protocols and demonstrating robust gains across model families and datasets.
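To make the objective concrete, the following is a minimal NumPy sketch of one plausible form of the entropy-based surrogate described above. It is an illustration, not the paper's implementation: the function name `ib_cbm_loss`, the weighting scheme, and the use of the batch-marginal concept entropy as a stand-in for $I(X;C)$ (exact only for a deterministic encoder, where $I(X;C) = H(C)$) are all assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def bernoulli_entropy(p, eps=1e-8):
    # Elementwise entropy of Bernoulli(p), in nats.
    p = np.clip(p, eps, 1.0 - eps)
    return -(p * np.log(p) + (1.0 - p) * np.log(1.0 - p))

def ib_cbm_loss(concept_logits, concept_labels, task_log_probs, task_labels,
                beta=0.1, eps=1e-8):
    """Hypothetical IB-regularized CBM objective (illustrative only):
    concept BCE + task NLL + beta * entropy surrogate for I(X;C)."""
    c = sigmoid(concept_logits)                      # concept activations in (0, 1)
    c_clip = np.clip(c, eps, 1.0 - eps)
    # Concept supervision: binary cross-entropy against concept annotations.
    bce = -np.mean(concept_labels * np.log(c_clip)
                   + (1.0 - concept_labels) * np.log(1.0 - c_clip))
    # Task loss: negative log-likelihood of the true class.
    nll = -np.mean(task_log_probs[np.arange(len(task_labels)), task_labels])
    # Entropy surrogate: for a deterministic concept encoder, I(X;C) = H(C);
    # here H(C) is estimated from batch-marginal concept activation rates.
    p_marginal = c.mean(axis=0)
    h_c = bernoulli_entropy(p_marginal).sum()
    return bce + nll + beta * h_c
```

With `beta = 0` this reduces to standard joint CBM training; increasing `beta` trades concept capacity against compression, which is the minimal-sufficiency pressure the abstract describes.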

Karim Galliamov, Syed M Ahsan Kazmi, Adil Khan, Adín Ramírez Rivera • 2026

Related benchmarks

Task | Dataset | Result | Rank
Classification | CUB | Accuracy: 77.8 | 85
Concept leakage evaluation | CUB | OIS: 20.85 | 54
Concept leakage evaluation | AWA2 | OIS: 16.01 | 36
Concept leakage evaluation | aPY | OIS: 16.06 | 36
Classification | AWA2 | Class Accuracy: 88.6 | 22
Classification | aPY | Class Accuracy: 87.9 | 16
