ProphetFuzz: Fully Automated Prediction and Fuzzing of High-Risk Option Combinations with Only Documentation via Large Language Model
About
Vulnerabilities related to option combinations pose a significant challenge in software security testing due to their vast search space. Previous research primarily addressed this challenge through mutation or filtering techniques, which treated all option combinations as equally likely to contain vulnerabilities, wasting considerable time on non-vulnerable targets and lowering testing efficiency. In this paper, we use carefully designed prompt engineering to drive a large language model (LLM) to predict high-risk option combinations (i.e., those more likely to contain vulnerabilities) and perform fuzz testing automatically, without human intervention. We developed a tool called ProphetFuzz and evaluated it on a dataset of 52 programs collected from three related studies. The entire experiment consumed 10.44 CPU years. ProphetFuzz predicted 1748 high-risk option combinations at an average cost of only $8.69 per program. Results show that after 72 hours of fuzzing, ProphetFuzz discovered 364 unique vulnerabilities associated with 12.30% of the predicted high-risk option combinations, 32.85% more than the state-of-the-art approach found in the same timeframe. Additionally, using ProphetFuzz, we conducted persistent fuzzing on the latest versions of these programs, uncovering 140 vulnerabilities, of which 93 were confirmed by developers and 21 were assigned CVE numbers.
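To make the "vast search space" and the headline figures concrete, the following is a minimal sketch rather than part of ProphetFuzz itself. The helper `combination_count` and the example of a program with 30 options are illustrative assumptions; only the constants marked as quoted come from the abstract.

```python
from math import comb


def combination_count(num_options: int, max_size: int) -> int:
    """Count option subsets containing between 2 and max_size options.

    Hypothetical helper for illustration; it ignores option values and
    ordering, so real search spaces are larger still.
    """
    return sum(comb(num_options, k) for k in range(2, max_size + 1))


# Assumption: a program exposing 30 command-line options (not from the paper).
pairs_and_triples = combination_count(30, 3)

# Figures quoted in the abstract.
PREDICTED_COMBINATIONS = 1748
VULNERABLE_SHARE = 0.1230
PROGRAMS = 52
COST_PER_PROGRAM_USD = 8.69

# Derived quantities (simple arithmetic on the quoted figures).
vulnerable_combinations = round(PREDICTED_COMBINATIONS * VULNERABLE_SHARE)
total_cost_usd = round(PROGRAMS * COST_PER_PROGRAM_USD, 2)

print(pairs_and_triples, vulnerable_combinations, total_cost_usd)
```

Under these assumptions, even restricting attention to pairs and triples of 30 options yields several thousand candidates for a single program, which is why predicting a short list of high-risk combinations matters. The quoted 12.30% corresponds to roughly 215 of the 1748 predicted combinations.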
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Edge Coverage | avconv POWER program | Edge Coverage | 1.85e+4 | 7 |
| Edge Coverage | ffmpeg POWER program | Edge Coverage | 2.28e+4 | 7 |
| Edge Coverage | tiff2pdf POWER program | Edge Coverage | 3.79e+3 | 7 |
| Edge Coverage | tiff2ps POWER program | Edge Coverage | 2.70e+3 | 7 |
| Vulnerability Detection | POWER and CarpetFuzz 24-hour runs | Total Vulnerabilities | 7 | 7 |
| Edge Coverage | cjpeg POWER program | Edge Coverage | 1.13e+3 | 7 |
| Edge Coverage | gm POWER program | Edge Coverage | 5.47e+3 | 7 |
| Edge Coverage | gs POWER program | Edge Coverage | 1.53e+4 | 7 |
| Edge Coverage | objdump POWER program | Edge Coverage | 7.59e+3 | 7 |
| Edge Coverage | readelf POWER program | Edge Coverage Count | 5.91e+3 | 7 |