LLM-AUG: Robust Wireless Data Augmentation with In-Context Learning in Large Language Models

About

Data scarcity remains a fundamental bottleneck in applying deep learning to wireless communication problems, particularly in scenarios where collecting labeled Radio Frequency (RF) data is expensive, time-consuming, or operationally constrained. This paper proposes LLM-AUG, a data augmentation framework that leverages in-context learning in large language models (LLMs) to generate synthetic training samples directly in a learned embedding space. Unlike conventional generative approaches that require training task-specific models, LLM-AUG performs data generation through structured prompting, enabling rapid adaptation in low-shot regimes. We evaluate LLM-AUG on two representative tasks, modulation classification and interference classification, using the RadioML 2016.10A dataset and the Interference Classification (IC) dataset, respectively. Results show that LLM-AUG consistently outperforms traditional augmentation and deep generative baselines across low-shot settings and reaches near-oracle performance using only 15% of the labeled data. LLM-AUG further demonstrates improved robustness under distribution shifts, yielding a 29.4% relative gain over diffusion-based augmentation at a lower SNR value. On the RadioML and IC datasets, LLM-AUG yields relative gains of 67.6% and 35.7%, respectively, over the diffusion-based baseline. The t-SNE visualizations further validate that synthetic samples generated by LLM-AUG better preserve class structure in the embedding space, leading to more consistent and informative augmentations. These results demonstrate that LLMs can serve as effective and practical data augmenters for wireless machine learning, enabling robust and data-efficient learning in evolving wireless environments.
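The abstract says that samples are generated in an embedding space through structured prompting, but it does not give the prompt layout. The following is a minimal Python sketch of the text handling such a pipeline might need, under stated assumptions: the function names `build_prompt` and `parse_response`, the `class=...: [...]` line format, and the rounding precision are all hypothetical and are not taken from the paper. The LLM call itself is deliberately left out.

```python
import re
from typing import Dict, List

Vector = List[float]


def build_prompt(examples: Dict[str, List[Vector]], target_class: str,
                 n_new: int, decimals: int = 3) -> str:
    """Format few-shot embeddings as an in-context prompt (hypothetical layout)."""
    if target_class not in examples:
        raise ValueError(f"no in-context examples for class {target_class!r}")
    lines = ["Each line is an embedding vector with its class label."]
    for label, vectors in examples.items():
        for vec in vectors:
            values = ", ".join(f"{v:.{decimals}f}" for v in vec)
            lines.append(f"class={label}: [{values}]")
    lines.append(
        f"Generate {n_new} new vectors for class={target_class}, "
        "one per line, in the same bracketed format."
    )
    return "\n".join(lines)


def parse_response(text: str, dim: int) -> List[Vector]:
    """Extract bracketed vectors from a model reply, dropping malformed ones."""
    vectors: List[Vector] = []
    for body in re.findall(r"\[([^\[\]]*)\]", text):
        try:
            vec = [float(tok) for tok in body.split(",")]
        except ValueError:
            continue  # non-numeric token: skip this candidate
        if len(vec) == dim:  # keep only vectors of the expected dimension
            vectors.append(vec)
    return vectors
```

In this sketch the parsed vectors would be appended to the real embeddings of `target_class` before training the downstream classifier. Filtering by dimension and numeric validity is one simple guard against malformed generations; how the paper validates its samples is not stated in the abstract.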

Pranshav Gajjar, Manan Tiwari, Sayanta Seth, Vijay K. Shah • 2026

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| Interference Classification | IC dataset, 25 s/cls | F1 Score 79.9 | 14 |
| Interference Classification | IC dataset, 50 s/cls | F1 Score 84.8 | 14 |
| Modulation Classification | RadioML 2016.10A, 10 s/cls | F1 Score 37 | 14 |
| Modulation Classification | RadioML 2016.10A, 25 s/cls | F1 Score 48.3 | 14 |
| Modulation Classification | RadioML 2016.10A, 50 s/cls | F1 Score 48.9 | 14 |
| Interference Classification | IC dataset, 10 s/cls | F1 Score 52.7 | 14 |