Functorial Neural Architectures from Higher Inductive Types
About
Neural networks systematically fail at compositional generalization -- producing correct outputs for novel combinations of known parts. We show that this failure is architectural: compositional generalization is equivalent to functoriality of the decoder, and this perspective yields both guarantees and impossibility results. We compile Higher Inductive Type (HIT) specifications into neural architectures via a monoidal functor from the path groupoid of a target space to a category of parametric maps: path constructors become generator networks, composition becomes structural concatenation, and 2-cells witnessing group relations become learned natural transformations. We prove that decoders assembled by structural concatenation of independently generated segments are strict monoidal functors (compositional by construction), while softmax self-attention is not functorial for any non-trivial compositional task. Both results are formalized in Cubical Agda. Experiments on three spaces validate the full hierarchy: on the torus ($\mathbb{Z}^2$), functorial decoders outperform non-functorial ones by 2-2.7x; on $S^1 \vee S^1$ ($F_2$), the type-A/B gap widens to 5.5-10x; on the Klein bottle ($\mathbb{Z} \rtimes \mathbb{Z}$), a learned 2-cell closes a 46% error gap on words exercising the group relation.
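As an illustration of what "compositional by construction" means for a decoder built by structural concatenation, here is a minimal sketch in Python. It is not the paper's implementation: the names `loop_segment`, `GENERATORS`, and `decode` are hypothetical, and the fixed geometric loops stand in for the learned generator networks. Each generator letter of a torus word produces its segment independently of context, so decoding a concatenated word equals concatenating the decodings.

```python
import math
from typing import Callable, Dict, List, Tuple

Point = Tuple[float, float]   # (theta, phi) angles on the torus
Segment = List[Point]


def loop_segment(axis: int, direction: int, n_points: int = 8) -> Segment:
    """Fixed stand-in (assumption) for a generator network:
    one loop around a single torus axis, in the given direction."""
    pts: Segment = []
    for k in range(n_points):
        angle = direction * 2.0 * math.pi * k / n_points
        pts.append((angle, 0.0) if axis == 0 else (0.0, angle))
    return pts


# Hypothetical generator table: lowercase letters are the two loop
# generators, uppercase letters are their inverses.
GENERATORS: Dict[str, Callable[[], Segment]] = {
    "a": lambda: loop_segment(0, +1),
    "A": lambda: loop_segment(0, -1),
    "b": lambda: loop_segment(1, +1),
    "B": lambda: loop_segment(1, -1),
}


def decode(word: str) -> List[Segment]:
    """Decode a word by concatenating independently generated segments."""
    return [GENERATORS[g]() for g in word]


# Functoriality check: word concatenation maps to segment concatenation,
# and the empty word maps to the empty output.
w1, w2 = "ab", "bA"
assert decode(w1 + w2) == decode(w1) + decode(w2)
assert decode("") == []
```

Because no segment depends on its neighbours, the two assertions hold for any pair of words, not only the ones shown. A decoder whose output for one segment depends on the rest of the word has no such guarantee.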
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Geometric loop generation | Torus T^2 L=2 (test) | Per-segment Chamfer Distance | 1.68 | 5 |
| Geometric loop generation | Torus T^2 L=4 (test) | Per-segment Chamfer Distance | 0.86 | 5 |
| Geometric loop generation | Torus T^2 L=6 (test) | Per-segment Chamfer Distance | 0.74 | 5 |
| Geometric loop generation | Torus T^2 L=8 (test) | Per-segment Chamfer Distance | 0.73 | 5 |
| Geometric loop generation | Torus T^2 L=10 (test) | Per-segment Chamfer Distance | 0.77 | 5 |
| Geometric path generation | Klein bottle Canonical words L=10 | Per-segment Chamfer Distance | 0.82 | 4 |
| Geometric path generation | Klein bottle Non-canonical words L=10 | Per-segment Chamfer Distance | 0.82 | 4 |
| Geometric generation | Wedge of circles S^1 ∨ S^1 L=2 | Per-segment Chamfer Distance | 0.002 | 3 |
| Geometric generation | Wedge of circles S^1 ∨ S^1 L=6 | Per-segment Chamfer Distance | 0.018 | 3 |
| Geometric generation | Wedge of circles S^1 ∨ S^1 L=10 | Per-segment Chamfer Distance | 0.054 | 3 |
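
The page does not define the per-segment Chamfer distance reported in the table. The sketch below shows one common reading, under stated assumptions: a symmetric Chamfer distance with unsquared Euclidean nearest-neighbour distances, summed over both directions and then averaged over segments. The function names `chamfer` and `per_segment_chamfer` are hypothetical, and the benchmark may use a different variant (for example squared distances or a different normalization).

```python
import math
from typing import List, Sequence

Point = Sequence[float]


def chamfer(p: List[Point], q: List[Point]) -> float:
    """Symmetric Chamfer distance between two point sets (assumed variant):
    mean nearest-neighbour distance from p to q plus the same from q to p."""
    def one_way(src: List[Point], dst: List[Point]) -> float:
        return sum(min(math.dist(s, d) for d in dst) for s in src) / len(src)

    return one_way(p, q) + one_way(q, p)


def per_segment_chamfer(pred: List[List[Point]],
                        target: List[List[Point]]) -> float:
    """Average the Chamfer distance over aligned (predicted, target) segments."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same number of segments")
    return sum(chamfer(p, t) for p, t in zip(pred, target)) / len(pred)


# Identical segments give zero distance.
seg = [(0.0, 0.0), (1.0, 0.0)]
assert per_segment_chamfer([seg, seg], [seg, seg]) == 0.0
```

Scoring segment by segment, rather than over the whole generated path, keeps an error in one segment from being masked by nearby points in another.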