
Topological derivative approach for deep neural network architecture adaptation

About

This work presents a novel algorithm for progressively adapting a neural network architecture along its depth. In particular, we attempt to address the following questions in a mathematically principled way: i) Where should new capacity (a layer) be added during the training process? ii) How should the new capacity be initialized? At the heart of our approach are two key ingredients: i) the introduction of a "shape functional" to be minimized, which depends on the neural network topology, and ii) the introduction of a topological derivative of the shape functional with respect to the neural network topology. Using an optimal control viewpoint, we show that the network topological derivative exists under certain conditions, and we derive its closed-form expression. In particular, we explore, for the first time, the connection between the topological derivative from a topology optimization framework and the Hamiltonian from optimal control theory. Further, we show that the optimality condition for the shape functional leads to an eigenvalue problem for deep neural architecture adaptation. Our approach thus determines the most sensitive location along the depth where a new layer needs to be inserted during the training phase, together with the parametric initialization for the newly added layer. We also demonstrate that our layer insertion strategy can be derived from an optimal transport viewpoint as a solution to maximizing a topological derivative in $p$-Wasserstein space, where $p \geq 1$. Numerical investigations with fully connected networks, convolutional neural networks, and vision transformers on various regression and classification problems demonstrate that our proposed approach can outperform an ad-hoc baseline network and other architecture adaptation strategies. Further, we also demonstrate other applications of the topological derivative in fields such as transfer learning.
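To make the layer-insertion idea concrete, the sketch below shows a function-preserving depth insertion in a toy residual network: a per-depth sensitivity proxy stands in for the topological derivative (the paper derives the actual quantity from an optimal control eigenvalue problem, which is not reproduced here), and the new block is zero-initialized so the network's input-output map is unchanged at insertion time. All function names and the sensitivity proxy are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def forward(x, layers):
    # Toy residual network: x_{t+1} = x_t + relu(W_t @ x_t)
    for W in layers:
        x = x + relu(W @ x)
    return x

def sensitivity_proxy(x, layers):
    # Illustrative stand-in for the topological derivative: score each
    # depth by the norm of the hidden state entering that layer. The
    # paper instead evaluates a closed-form derivative of the shape
    # functional at every candidate insertion location.
    scores = []
    for W in layers:
        scores.append(float(np.linalg.norm(x)))
        x = x + relu(W @ x)
    return scores

def insert_layer(layers, position):
    # Insert a new residual block at `position`, zero-initialized so the
    # identity residual update leaves the network's map unchanged. (The
    # paper initializes the new layer from an eigenvalue problem rather
    # than with zeros; zeros are used here only to keep the sketch exact.)
    d = layers[0].shape[0]
    new_W = np.zeros((d, d))
    return layers[:position] + [new_W] + layers[position:]

rng = np.random.default_rng(0)
d = 4
layers = [0.1 * rng.standard_normal((d, d)) for _ in range(3)]
x = rng.standard_normal(d)

# Pick the most "sensitive" depth under the proxy and insert there.
where = int(np.argmax(sensitivity_proxy(x, layers)))
grown = insert_layer(layers, where)

# Zero-initialized insertion preserves the input-output map exactly.
assert np.allclose(forward(x, layers), forward(x, grown))
```

After insertion, training resumes on the deeper network, and the new block's weights move away from zero only insofar as they reduce the loss.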

C. G. Krishnanunni, Tan Bui-Thanh, Clint Dawson • 2025

Related benchmarks

Task | Dataset | Metric | Result | Rank
Regression | California Housing | MSE | 0.384 | 71
Image Classification | MNIST (train) | Train Accuracy | 98.99 | 53
2D heat conductivity inversion | 2D heat equation S=1000 | Best Relative Error | 0.327 | 9
2D heat conductivity inversion | 2D heat equation S=1500 | Best Relative Error | 32.1 | 9
Image Classification | MNIST S=600 (train) | Best Accuracy | 88.44 | 7
Inverse Problem | Navier-Stokes equation S=250 1.0 (test) | Best Error | 0.295 | 7
Wind velocity reconstruction | Wind velocity reconstruction S=1000 | Best Error | 12.4 | 7
Wind velocity reconstruction | Wind velocity reconstruction S=5000 | Best Error | 3.77 | 7
Inverse Problem | Navier-Stokes equation S=500 1.0 (test) | Minimum Error | 0.271 | 7
Heat equation parameter-to-observable mapping | Heat equation parameter-to-observable map 1000 samples (test) | MSE | 2.576 | 5
