Traffic-MoE: A Sparse Foundation Model for Network Traffic Analysis
About
While pre-trained large models have achieved state-of-the-art performance in network traffic analysis, their prohibitive computational costs hinder deployment in real-time, throughput-sensitive network defense environments. This work bridges the gap between advanced representation learning and practical network protection by introducing Traffic-MoE, a sparse foundation model optimized for high-efficiency real-time inference. By dynamically routing traffic tokens to a small subset of specialized experts, Traffic-MoE effectively decouples model capacity from computational overhead. Extensive evaluations across three security-oriented tasks demonstrate that Traffic-MoE achieves up to a 12.38% improvement in detection performance compared to leading dense competitors. Crucially, it delivers a 91.62% increase in throughput, reduces inference latency by 47.81%, and cuts peak GPU memory consumption by 38.72%. Beyond efficiency, Traffic-MoE exhibits superior robustness against adversarial traffic shaping and maintains high detection efficacy in few-shot scenarios, establishing a new paradigm for scalable and resilient network traffic analysis.
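The core mechanism described above — routing each traffic token to a small subset of experts so that compute per token stays fixed while total capacity grows — can be sketched as follows. This is a minimal NumPy illustration of generic top-k gating, not the authors' implementation; all names (`SparseMoELayer`, `d_model`, `n_experts`, `top_k`) and the use of plain linear maps as stand-ins for expert FFNs are assumptions for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

class SparseMoELayer:
    """Hypothetical sketch: route each token to its top-k experts.

    Only k of n_experts run per token, so per-token compute is
    decoupled from total parameter count, as the abstract describes.
    """

    def __init__(self, d_model, n_experts, top_k, seed=0):
        rng = np.random.default_rng(seed)
        self.top_k = top_k
        # gating network: scores each token against each expert
        self.gate = rng.standard_normal((d_model, n_experts)) * 0.02
        # each expert is a simple linear map (stand-in for an expert FFN)
        self.experts = [rng.standard_normal((d_model, d_model)) * 0.02
                        for _ in range(n_experts)]

    def forward(self, tokens):
        # tokens: (n_tokens, d_model)
        logits = tokens @ self.gate                    # (n_tokens, n_experts)
        probs = softmax(logits, axis=-1)
        # indices of the k highest-scoring experts per token
        topk = np.argsort(probs, axis=-1)[:, -self.top_k:]
        out = np.zeros_like(tokens)
        for i, tok in enumerate(tokens):
            chosen = topk[i]
            weights = probs[i, chosen]
            weights = weights / weights.sum()          # renormalize over chosen experts
            for e, w in zip(chosen, weights):          # only k experts execute
                out[i] += w * (tok @ self.experts[e])
        return out, topk

moe = SparseMoELayer(d_model=16, n_experts=8, top_k=2)
x = np.random.default_rng(1).standard_normal((4, 16))
y, routing = moe.forward(x)
print(y.shape, routing.shape)
```

With 8 experts and `top_k=2`, each token touches only a quarter of the expert parameters per forward pass, which is the source of the throughput and memory savings the abstract reports relative to an equally large dense model.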
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| IoT/IoMT Attack Detection | CICIoT 2023 (test) | Mean F1 Score | 78.24 | 31 |
| Network Traffic Analysis | CICIoMT Time-shift 2024 | Accuracy | 82.19 | 7 |
| Network Traffic Analysis | CICIoMT2024 Proportion-shift | Accuracy | 97.27 | 7 |
| Network Traffic Analysis | CICIoMT Compose-shift 2024 | Accuracy | 80.71 | 7 |
| Network Traffic Analysis | CICIoT2023 Proportion-shift | Accuracy | 80.52 | 7 |
| Network Traffic Analysis | CICIoT Compose-shift 2023 | Accuracy | 63.31 | 7 |
| Service Classification | ISCXVPN NonVPN 2016 | Accuracy | 76.13 | 7 |
| Service Classification | ISCXVPN 2016 (Mixed) | Accuracy | 76.79 | 7 |
| Traffic Detection | ISCXTor NonTor 2016 | Accuracy | 98.27 | 7 |
| Traffic Detection | ISCX Tor 2016 | Accuracy | 90.89 | 7 |