Extracting Recurring Vulnerabilities from Black-Box LLM-Generated Software

About

LLMs are increasingly used for code generation, but their outputs often follow recurring templates that can induce predictable vulnerabilities. We study vulnerability persistence in LLM-generated software and introduce Feature-Security Table (FSTab) with two components. First, FSTab enables a black-box attack that predicts likely backend vulnerabilities from observable frontend features and knowledge of the source LLM, without access to backend code or source code. Second, FSTab provides a model-centric evaluation that quantifies how consistently a given model reproduces the same vulnerabilities across programs, semantics-preserving rephrasings, and application domains. We evaluate FSTab on state-of-the-art code LLMs, including GPT-5.2, Claude-4.5 Opus, and Gemini-3 Pro, across diverse application domains. Our results show strong cross-domain transfer: even when the target domain is excluded from training, FSTab achieves up to 94% attack success and 93% vulnerability coverage on Internal Tools (Claude-4.5 Opus). These findings expose an underexplored attack surface in LLM-generated software and highlight the security risks of code generation. Our code is available at: https://anonymous.4open.science/r/FSTab-024E.

Tomer Kordonsky, Maayan Yamin, Noam Benzimra, Amit LeVi, Avi Mendelson • 2026
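
The abstract describes predicting backend vulnerabilities from observable frontend features plus knowledge of the source LLM. The sketch below is only a rough illustration of that lookup idea, not the authors' implementation: the co-occurrence counting, the function names (`build_table`, `predict`, `coverage`), the model labels, the feature names, and all of the data are hypothetical and invented for this example.

```python
from collections import Counter, defaultdict


def build_table(observations):
    """Count vulnerabilities per (model, frontend feature) pair.

    observations: iterable of (model, frontend_feature, vulnerability) triples.
    This counting scheme is an assumption made for illustration.
    """
    table = defaultdict(Counter)
    for model, feature, vuln in observations:
        table[(model, feature)][vuln] += 1
    return table


def predict(table, model, features, top_k=2):
    """Rank vulnerabilities by how often they co-occurred with the given features."""
    scores = Counter()
    for feature in features:
        # .get avoids creating empty entries for unseen (model, feature) pairs.
        scores.update(table.get((model, feature), Counter()))
    # Sort by count (descending), then by name so ties are deterministic.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [vuln for vuln, _ in ranked[:top_k]]


def coverage(predicted, actual):
    """Fraction of the actual vulnerabilities that appear in the prediction."""
    actual = set(actual)
    if not actual:
        return 0.0
    return len(actual & set(predicted)) / len(actual)


# Toy, made-up observations (hypothetical models, features, and findings).
observations = [
    ("model-a", "login_form", "sql_injection"),
    ("model-a", "login_form", "sql_injection"),
    ("model-a", "login_form", "weak_password_hash"),
    ("model-a", "file_upload", "path_traversal"),
    ("model-b", "login_form", "missing_rate_limit"),
]

table = build_table(observations)
predicted = predict(table, "model-a", ["login_form", "file_upload"])
actual = ["sql_injection", "path_traversal", "weak_password_hash"]
print(predicted, coverage(predicted, actual))
```

Keying the table on the (model, feature) pair reflects the abstract's point that the prediction depends on knowing which LLM produced the software; a real system would need a far richer feature and vulnerability taxonomy than this toy one.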

Related benchmarks

Task | Dataset | Result | Rank
Vulnerability Attack Analysis | WebGenBench E-commerce target-domain and cross-domain 1.0 | -- | 12
Vulnerability Attack Analysis | WebGenBench Internal Tools target-domain and cross-domain 1.0 | -- | 12
Vulnerability Attack Analysis | WebGenBench Social Media target-domain and cross-domain 1.0 | -- | 12
Vulnerability Attack Analysis | WebGenBench Blogging target-domain and cross-domain 1.0 | -- | 12
Vulnerability Attack Analysis | WebGenBench Dashboards target-domain and cross-domain 1.0 | -- | 12
Code-LLM Vulnerability Recurrence Evaluation | FSTab LLM-generated software | -- | 6
Vulnerability Attack Performance | E2E (dev) | -- | 6
Vulnerability Attack Performance | E2E Cross-domain (dev) | -- | 6