
ET-BERT: A Contextualized Datagram Representation with Pre-training Transformers for Encrypted Traffic Classification

About

Encrypted traffic classification requires discriminative and robust traffic representations captured from content-invisible and imbalanced traffic data, which is challenging but indispensable for network security and network management. The major limitation of existing solutions is that they rely heavily on deep features, which are overly dependent on data size and hard to generalize to unseen data. How to leverage open-domain unlabeled traffic data to learn representations with strong generalization ability remains a key challenge. In this paper, we propose a new traffic representation model called Encrypted Traffic Bidirectional Encoder Representations from Transformer (ET-BERT), which pre-trains deep contextualized datagram-level representations from large-scale unlabeled data. The pre-trained model can be fine-tuned on a small amount of task-specific labeled data and achieves state-of-the-art performance across five encrypted traffic classification tasks, remarkably pushing the F1 of ISCX-Tor to 99.2% (4.4% absolute improvement), ISCX-VPN-Service to 98.9% (5.2% absolute improvement), Cross-Platform (Android) to 92.5% (5.4% absolute improvement), and CSTNET-TLS 1.3 to 97.4% (10.0% absolute improvement). Notably, we provide an explanation of the empirically powerful pre-training model by analyzing the randomness of ciphers. It gives us insight into the boundary of classification ability over encrypted traffic. The code is available at: https://github.com/linwhitehat/ET-BERT.

Xinjie Lin, Gang Xiong, Gaopeng Gou, Zhen Li, Junzheng Shi, Jing Yu • 2022

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| IoT/IoMT Attack Detection | CICIoT 2023 (test) | Mean F1 Score | 52.17 | 31 |
| Traffic Classification | USTC-TFC 2016 | Accuracy | 97.35 | 13 |
| Traffic Classification | CICIoT 2022 | Accuracy | 94.81 | 13 |
| Traffic Classification | ISCXVPN 2016 | Accuracy (AC) | 0.89 | 13 |
| Traffic Classification | CipherSpectrum | Accuracy (AC) | 70.26 | 13 |
| Traffic Classification | CSTNET-TLS1.3 | Accuracy | 50.02 | 13 |
| Encrypted Traffic Classification | ISCXVPN (test) | Mean Precision | 61.73 | 11 |
| Encrypted Traffic Classification | ISCXTor (test) | Macro Precision | 59.62 | 11 |
| Encrypted Traffic Classification | CHNAPP (test) | Precision (Macro) | 55.03 | 11 |
| IoT/IoMT Attack Detection | CICIoMT 2024 (test) | Accuracy | 97.69 | 7 |
Showing 10 of 22 rows
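
Several benchmark rows report macro-averaged precision or a mean F1 score rather than accuracy. With imbalanced traffic classes these can diverge sharply from accuracy, because every class counts equally regardless of how many samples it has. The sketch below shows one common way to compute such scores, assuming "macro" and "mean" both denote an unweighted average over classes; this page does not state how each benchmark averaged its metric. The function name and the class labels in the usage note are hypothetical.

```python
from collections import Counter


def macro_precision_f1(y_true, y_pred):
    """Return (macro precision, macro F1) as unweighted means over classes.

    Assumption: each class contributes equally, whatever its sample count.
    """
    classes = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1          # correct prediction for class t
        else:
            fp[p] += 1          # p was predicted but is wrong
            fn[t] += 1          # t was the true class but was missed

    precisions, f1s = [], []
    for c in classes:
        # Guard against classes that were never predicted or never present.
        prec = tp[c] / (tp[c] + fp[c]) if tp[c] + fp[c] else 0.0
        rec = tp[c] / (tp[c] + fn[c]) if tp[c] + fn[c] else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        precisions.append(prec)
        f1s.append(f1)

    n = len(classes)
    return sum(precisions) / n, sum(f1s) / n
```

For example, calling `macro_precision_f1(["vpn", "vpn", "tor", "tor"], ["vpn", "tor", "tor", "tor"])` scores each of the two classes separately and then averages them, so a single misclassified sample in a small class moves the result more than it would move overall accuracy.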
