NetGPT: Generative Pretrained Transformer for Network Traffic

About

All data on the Internet is carried by network traffic, so accurately modeling network traffic can help improve network service quality and protect data privacy. Pretrained models for network traffic can use large-scale raw data to learn the essential characteristics of network traffic and produce distinguishable representations of input traffic independent of any specific downstream task. Effective pretrained models can significantly improve the training efficiency and effectiveness of downstream tasks such as application classification, attack detection, and traffic generation. Despite the great success of pretraining in natural language processing, no such work exists in the network field. Given the diverse demands and characteristics of network traffic and network tasks, building a pretrained model for network traffic is non-trivial. Two challenges stand out: the heterogeneous headers and payloads in multi-pattern network traffic, and the differing context dependencies of diverse downstream network tasks. To tackle these challenges, in this paper we make the first attempt to provide a generative pretrained model, NetGPT, for both traffic understanding and generation tasks. We propose multi-pattern network traffic modeling to construct unified text inputs that support both traffic understanding and generation tasks. We further improve how well the pretrained model adapts to diverse tasks by shuffling header fields, segmenting packets in flows, and incorporating diverse task labels with prompts. Extensive experiments on diverse traffic datasets covering encrypted software, DNS, private industrial protocols, and cryptocurrency mining demonstrate the effectiveness of NetGPT across a range of traffic understanding and generation tasks, where it outperforms state-of-the-art baselines by a wide margin.

Xuying Meng, Chungang Lin, Yequan Wang, Yujun Zhang • 2023
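
The abstract names three adaptation steps (shuffling header fields, segmenting packets in flows, and attaching task labels as prompts) on top of a unified text input. The sketch below is only an illustration of how such steps might look, not the authors' implementation. The abstract does not specify the encoding, so the two-character hex tokens, the `[PKT]` separator, the bracketed label format, and all function names are assumptions made for this example.

```python
import random


def packet_to_hex_tokens(packet: bytes) -> list[str]:
    """Render raw bytes as two-character hex tokens (assumed text encoding)."""
    return [f"{byte:02x}" for byte in packet]


def shuffle_header_fields(
    fields: dict[str, bytes], rng: random.Random
) -> list[tuple[str, bytes]]:
    """Return the header fields in a random order without changing their values."""
    items = list(fields.items())
    rng.shuffle(items)
    return items


def segment_flow(packets: list[bytes], sep: str = "[PKT]") -> list[str]:
    """Concatenate the packets of a flow, closing each packet with a separator token.

    The separator name is hypothetical; it only marks packet boundaries.
    """
    tokens: list[str] = []
    for packet in packets:
        tokens.extend(packet_to_hex_tokens(packet))
        tokens.append(sep)
    return tokens


def build_prompted_input(task_label: str, tokens: list[str]) -> str:
    """Prefix the token sequence with a task label (hypothetical prompt format)."""
    return f"[{task_label}] " + " ".join(tokens)


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical header fields, used only to show the shuffle step.
    header = {"ttl": b"\x40", "proto": b"\x06", "flags": b"\x02"}
    print(shuffle_header_fields(header, rng))

    flow = [b"\x45\x00", b"\x16\x03\x01"]
    print(build_prompted_input("app_classification", segment_flow(flow)))
```

Keeping each step as a separate function lets the same flow be turned into inputs for different tasks by changing only the label passed to `build_prompted_input`.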

Related benchmarks

Task	Dataset	Metric	Result
IoT/IoMT Attack Detection	CICIoT 2023 (test)	Mean F1 Score	69.62	31
Service Classification	ISCXVPN NonVPN 2016	ACC	66.18	7
Service Classification	ISCXVPN 2016 (Mixed)	Accuracy	69.73	7
Traffic Detection	ISCXTor NonTor 2016	Accuracy	96.55	7
IoT/IoMT Attack Detection	CICIoMT 2024 (test)	Accuracy	89.58	7
Traffic Detection	ISCX Tor 2016	Accuracy	85.06	7
Network Traffic Analysis	CICIoMT2024 Proportion-shift	Accuracy	87.69	7
Network Traffic Analysis	CICIoT2023 Time-shift	Accuracy	38.54	7
Network Traffic Analysis	CICIoT2023 Proportion-shift	Accuracy	75.35	7
Network Traffic Analysis	CICIoMT Time-shift 2024	Accuracy	49.42	7

Showing 10 of 12 rows
