
ReLU-KAN: New Kolmogorov-Arnold Networks that Only Need Matrix Addition, Dot Multiplication, and ReLU

About

Limited by the complexity of basis function (B-spline) calculations, Kolmogorov-Arnold Networks (KAN) suffer from restricted parallel computing capability on GPUs. This paper proposes a novel ReLU-KAN implementation that inherits the core idea of KAN. By adopting ReLU (Rectified Linear Unit) and point-wise multiplication, we simplify the design of KAN's basis functions and optimize the computation process for efficient CUDA computing. The proposed ReLU-KAN architecture can be readily implemented on existing deep learning frameworks (e.g., PyTorch) for both inference and training. Experimental results demonstrate that ReLU-KAN achieves a 20x speedup over traditional KAN on 4-layer networks. Furthermore, ReLU-KAN exhibits a more stable training process and superior fitting ability while preserving the "catastrophic forgetting avoidance" property of KAN. The code is available at https://github.com/quiqi/relu_kan
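As a rough illustration of the idea described above, the paper replaces each B-spline basis function with a compactly supported bump built only from ReLU, subtraction, and point-wise multiplication. The sketch below follows that construction on an interval [s, e]; the exact normalizing constant and interval parameterization here are assumptions based on the abstract, and may differ from the official implementation linked above.

```python
import numpy as np

def relu(x):
    # Rectified Linear Unit: max(x, 0), applied element-wise.
    return np.maximum(x, 0.0)

def relu_kan_basis(x, s, e):
    """A ReLU-based replacement for a B-spline basis function.

    Supported on [s, e] and zero elsewhere:
        R(x) = [ReLU(e - x) * ReLU(x - s)]^2 * 16 / (e - s)^4
    The squaring makes the bump smooth at the endpoints, and the
    constant (an assumption here) scales the peak at (s + e) / 2 to 1.
    Only ReLU, subtraction, and point-wise multiplication are used,
    which is what makes the computation GPU-friendly.
    """
    return (relu(e - x) * relu(x - s)) ** 2 * 16.0 / (e - s) ** 4

# Evaluate one basis function on a small grid over [0, 1].
x = np.linspace(0.0, 1.0, 5)
print(relu_kan_basis(x, 0.0, 1.0))
```

A full layer would evaluate a bank of such bumps with shifted (s, e) pairs and combine them with a learned linear map, so the whole forward pass reduces to matrix operations plus ReLU.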

Qi Qiu, Tao Zhu, Helin Gong, Liming Chen, Huansheng Ning • 2024

Related benchmarks

Task                          Dataset                             Metric           Result   Rank
Function Approximation        Function Approximation [np, 6, 1]   Time (sec/iter)  0.0108   14
Physics-Informed PDE Solving  2D Helmholtz (test)                 Time (sec/iter)  0.153    14
