Independent Prototype Propagation for Zero-Shot Compositionality
About
Humans are good at compositional zero-shot reasoning; someone who has never seen a zebra before could nevertheless recognize one when we tell them it looks like a horse with black and white stripes. Machine learning systems, on the other hand, usually leverage spurious correlations in the training data, and while such correlations can help recognize objects in context, they hurt generalization. To be able to deal with underspecified datasets while still leveraging contextual clues during classification, we propose ProtoProp, a novel prototype propagation graph method. First, we learn prototypical representations of objects (e.g., zebra) that are conditionally independent w.r.t. their attribute labels (e.g., stripes) and vice versa. Next, we propagate the independent prototypes through a compositional graph to learn compositional prototypes of novel attribute-object combinations that reflect the dependencies of the target distribution. The method does not rely on any external data, such as class hierarchy graphs or pretrained word embeddings. We evaluate our approach on AO-Clevr, a synthetic and strongly visual dataset with clean labels, and UT-Zappos, a noisy real-world dataset of fine-grained shoe types. We show that in the generalized compositional zero-shot setting we outperform state-of-the-art results, and through ablations we show the importance of each part of the method and its contribution to the final results.
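The propagation step described above can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the independent attribute and object prototypes are already given as fixed vectors, and it replaces the learned graph propagation with a single unweighted averaging step over a bipartite attribute-object graph. The names `propagate_prototypes` and `classify`, and the toy prototype values, are hypothetical and chosen only for illustration.

```python
import numpy as np


def propagate_prototypes(attr_protos, obj_protos, pairs):
    """Build one prototype per (attribute, object) pair.

    Assumption: a single unweighted aggregation step, where each
    composition node takes the mean of its two neighbours. A learned
    propagation would apply trainable weights here instead.
    """
    n_attr, n_obj = len(attr_protos), len(obj_protos)
    nodes = np.vstack([attr_protos, obj_protos])  # (n_attr + n_obj, d)

    # Adjacency from composition nodes to attribute/object nodes.
    adj = np.zeros((len(pairs), n_attr + n_obj))
    for row, (a, o) in enumerate(pairs):
        adj[row, a] = 1.0
        adj[row, n_attr + o] = 1.0

    # Row-normalise so each composition averages its neighbours.
    adj /= adj.sum(axis=1, keepdims=True)
    return adj @ nodes


def classify(embedding, comp_protos):
    """Return the index of the nearest compositional prototype (Euclidean)."""
    dists = np.linalg.norm(comp_protos - embedding, axis=1)
    return int(np.argmin(dists))


# Toy prototypes (hypothetical values): attributes 0 = striped, 1 = plain;
# objects 0 = horse, 1 = car.
attr_protos = np.array([[1.0, 0.0], [0.0, 1.0]])
obj_protos = np.array([[2.0, 2.0], [-2.0, -2.0]])

# All attribute-object pairs, including ones never seen during training.
pairs = [(0, 0), (1, 0), (0, 1), (1, 1)]
comp_protos = propagate_prototypes(attr_protos, obj_protos, pairs)

# An image embedding close to "striped horse" is assigned to pair index 0.
prediction = classify(np.array([1.4, 0.9]), comp_protos)
```

Because the compositional prototypes are derived from the attribute and object prototypes rather than from examples of each pair, a pair that never occurs in training still receives a prototype, which is what makes zero-shot classification possible in this simplified picture.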
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Generalized Compositional Zero-Shot Learning | C-GQA (test) | AUC | 3.7 | 46 |
| Compositional Zero-Shot Learning | UT-Zappos Closed World | HM | 50.2 | 42 |
| Compositional Zero-Shot Learning | C-GQA Closed World | HM | 15.1 | 41 |
| Generalized Compositional Zero-Shot Learning | AO-Clevr (2:8) | Accuracy (Seen) | 98.6 | 4 |
| Generalized Compositional Zero-Shot Learning | AO-Clevr (4:6) | Seen Accuracy | 97.9 | 4 |
| Generalized Compositional Zero-Shot Learning | AO-Clevr (5:5) | Seen Accuracy | 96.7 | 4 |
| Generalized Compositional Zero-Shot Learning | AO-Clevr (6:4) | Seen Accuracy | 95.6 | 4 |
| Generalized Compositional Zero-Shot Learning | AO-Clevr (7:3) | Seen Score | 0.915 | 4 |
| Generalized Compositional Zero-Shot Learning | AO-Clevr (3:7) | Seen Score | 96.3 | 4 |
| Compositional Zero-Shot Learning | UT-Zappos | -- | -- | 3 |