Quantifying and Optimizing Simplicity via Polynomial Representations
About
Deep networks often exhibit a preference for "simple" solutions, and such a simplicity bias is widely believed to play a key role in generalization. Yet a broadly applicable, quantitative measure of simplicity remains elusive. We introduce polynomial representations as a distribution-aware, low-dimensional surrogate for neural functions: we approximate a network's predictive behavior along data-dependent interpolation paths using orthogonal polynomial bases, yielding a compact functional representation. We show that the effective degree of this representation serves as a practical simplicity metric that is predictive of generalization across tasks and architectures, and consistently outperforms existing generalization proxies such as sharpness. Finally, polynomial representations naturally yield a differentiable simplicity regularizer, which consistently improves generalization in image and text classification, fine-tuning contrastive vision-language models, and reinforcement learning.
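The abstract does not spell out how the representation or its effective degree is computed, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes a straight-line interpolation path between two inputs, a Legendre basis fitted by least squares, and an "effective degree" defined as the smallest degree whose coefficients capture a chosen fraction of the squared L2 energy. The names `path_coefficients`, `effective_degree`, and `energy_threshold` are hypothetical.

```python
import numpy as np
from numpy.polynomial import legendre


def path_coefficients(f, x0, x1, max_degree=10, num_points=64):
    """Fit Legendre coefficients to f along the segment from x0 to x1.

    Assumption: the path is a straight line, parameterized by t in [-1, 1].
    """
    t = np.linspace(-1.0, 1.0, num_points)
    alpha = (t + 1.0) / 2.0  # map t in [-1, 1] to alpha in [0, 1]
    points = (1.0 - alpha)[:, None] * x0[None, :] + alpha[:, None] * x1[None, :]
    values = np.array([f(p) for p in points])
    # Least-squares fit in the Legendre basis P_0, ..., P_max_degree.
    return legendre.legfit(t, values, max_degree)


def effective_degree(coeffs, energy_threshold=0.99):
    """Smallest degree capturing `energy_threshold` of the L2 energy.

    Assumption: this thresholded-energy definition is illustrative only.
    """
    k = np.arange(len(coeffs))
    # The squared L2 norm of P_k on [-1, 1] is 2 / (2k + 1).
    energy = coeffs**2 * 2.0 / (2.0 * k + 1.0)
    total = energy.sum()
    if total == 0.0:
        return 0
    cumulative = np.cumsum(energy) / total
    return int(np.searchsorted(cumulative, energy_threshold))


# Toy scalar functions standing in for a network output (hypothetical).
def linear_fn(x):
    return x.sum()


def cubic_fn(x):
    return x.sum() ** 3


x0, x1 = -np.ones(3), np.ones(3)
deg_linear = effective_degree(path_coefficients(linear_fn, np.zeros(3), x1))
deg_cubic = effective_degree(path_coefficients(cubic_fn, x0, x1))
print(deg_linear, deg_cubic)
```

In this toy setting, a function that is linear along the path gets a lower effective degree than a cubic one, which is the qualitative behavior a simplicity metric of this kind would need. A real application would replace the toy functions with a network's predictive output and average over many data-dependent paths.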
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Image Classification | ImageNet (val) | Top-1 Accuracy | 82.19 | 163 |
| Natural Language Understanding | GLUE (test dev) | MRPC Accuracy | 87.66 | 90 |
| Image Classification | ImageNet OOD Suite (test) | Accuracy (ImageNet-V2) | 72.04 | 4 |
| Image Classification | ImageNet (test) | Top-1 Accuracy | 75.01 | 4 |