Towards Reasonable Concept Bottleneck Models
About
We propose a novel, flexible, and efficient framework for designing Concept Bottleneck Models (CBMs) that enables practitioners to explicitly encode and extend their prior knowledge and beliefs about the concept-concept ($C-C$) and concept-task ($C \to Y$) relationships within the model's reasoning when making predictions. The resulting $\textbf{C}$oncept $\textbf{REA}$soning $\textbf{M}$odels (CREAMs) architecturally encode arbitrary types of $C-C$ relationships, such as mutual exclusivity, hierarchical associations, and/or correlations, as well as potentially sparse $C \to Y$ relationships. Moreover, CREAM can optionally incorporate a regularized side-channel to complement potentially incomplete concept sets, achieving competitive task performance while encouraging predictions to be concept-grounded. To evaluate CBMs in such settings, we introduce a $C \to Y$ agnostic metric that quantifies interpretability when predictions partially rely on the side-channel. In our experiments, we show that, without additional computational overhead, CREAMs support efficient interventions, can avoid concept leakage, and achieve black-box-level performance under missing concepts. We further analyze how the optional side-channel affects interpretability and intervenability. Importantly, the side-channel enables CBMs to remain effective even when only a limited number of concepts is available.
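To make the idea of architecturally encoded relationships more concrete, the following is a minimal sketch, not the authors' implementation. It assumes that mutual exclusivity among concepts can be illustrated with a softmax over each exclusive group, and that a sparse $C \to Y$ relationship can be illustrated with a binary mask on a linear layer. All function names, groups, weights, and masks below are hypothetical and chosen only for illustration; the side-channel is not modeled.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def concept_probs(logits, exclusive_groups):
    """Map concept logits to probabilities.

    Concepts in the same group are treated as mutually exclusive (softmax);
    all remaining concepts are treated as independent (sigmoid).
    This is an illustrative assumption, not the paper's mechanism.
    """
    probs = [sigmoid(z) for z in logits]
    for group in exclusive_groups:
        group_probs = softmax([logits[i] for i in group])
        for i, p in zip(group, group_probs):
            probs[i] = p
    return probs

def masked_task_logits(probs, weights, mask):
    """Sparse C -> Y layer: weights[k][i] contributes only where mask[k][i] == 1."""
    return [
        sum(w * m * p for w, m, p in zip(w_row, m_row, probs))
        for w_row, m_row in zip(weights, mask)
    ]

# Hypothetical example: 4 concepts (0-2 mutually exclusive), 2 task classes.
logits = [2.0, 0.5, -1.0, 0.3]
probs = concept_probs(logits, exclusive_groups=[[0, 1, 2]])
weights = [[1.0, -1.0, 0.5, 2.0], [0.3, 0.8, -0.2, 1.5]]
mask = [[1, 1, 1, 0],   # class 0 may only use concepts 0-2
        [0, 0, 0, 1]]   # class 1 may only use concept 3
y = masked_task_logits(probs, weights, mask)
```

In this sketch, intervening on a concept amounts to overwriting an entry of `probs` before calling `masked_task_logits`; because of the mask, changing concept 3 cannot affect the logit of class 0.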
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Classification | CelebA | Avg Accuracy | 80.92 | 185 |
| Classification | CUB | -- | -- | 93 |
| Image Classification | CelebA | -- | -- | 42 |
| Image Classification | iFMNIST | Peak Memory | 1 | 12 |
| Image Classification | cFMNIST | Peak Memory Usage | 1 | 12 |
| Classification | iFMNIST | Accuracy (Y) | 92.43 | 11 |
| Classification | cFMNIST | Accuracy (Y) | 92.38 | 11 |
| Image Classification | CelebA | Training Time | 1 | 8 |
| Concept-based Learning | CelebA | Training Time | 1.002 | 7 |
| Concept-based Learning | iFMNIST | Training Time | 1.585 | 7 |