Neuro-symbolic Static Analysis with LLM-generated Vulnerability Patterns
About
In this work, we present MoCQ, a neuro-symbolic static analysis framework that leverages large language models (LLMs) to automatically generate vulnerability detection patterns. This approach combines the precision and scalability of pattern-based static analysis with the semantic understanding and automation capabilities of LLMs. MoCQ extracts the domain-specific languages for expressing vulnerability patterns and employs an iterative refinement loop with trace-driven symbolic validation that provides precise feedback for pattern correction. We evaluated MoCQ on 12 vulnerability types across four languages (C/C++, Java, PHP, JavaScript). MoCQ achieves detection performance comparable to expert-developed patterns while requiring only hours of generation versus weeks of manual effort. Notably, MoCQ uncovered 46 new vulnerability patterns that security experts had missed and discovered 25 previously unknown vulnerabilities in real-world applications. MoCQ also outperforms prior approaches with stronger analysis capabilities and broader applicability.
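The iterative refinement loop described above can be illustrated with a minimal sketch. This is not MoCQ's implementation. It assumes a regular expression stands in for a real vulnerability query, a fixed list of candidates stands in for the LLM, and plain match-checking against labeled snippets stands in for trace-driven symbolic validation. All names (`Feedback`, `validate`, `refine_loop`, `make_stub_proposer`) are hypothetical.

```python
import re
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Feedback:
    """Validation result handed back to the pattern generator."""
    missed: List[str]        # vulnerable samples the pattern did not match
    false_alarms: List[str]  # safe samples the pattern matched by mistake


def validate(pattern: str, vulnerable: List[str], safe: List[str]) -> Feedback:
    """Check a candidate pattern against labeled samples (assumed stand-in
    for the paper's trace-driven symbolic validation)."""
    rx = re.compile(pattern)
    return Feedback(
        missed=[s for s in vulnerable if not rx.search(s)],
        false_alarms=[s for s in safe if rx.search(s)],
    )


def refine_loop(
    propose: Callable[[Optional[Feedback]], str],
    vulnerable: List[str],
    safe: List[str],
    max_rounds: int = 5,
) -> Optional[str]:
    """Ask for a pattern, validate it, and feed the result back until the
    pattern matches every vulnerable sample and no safe sample."""
    feedback: Optional[Feedback] = None
    for _ in range(max_rounds):
        pattern = propose(feedback)
        feedback = validate(pattern, vulnerable, safe)
        if not feedback.missed and not feedback.false_alarms:
            return pattern
    return None  # no acceptable pattern within the round budget


def make_stub_proposer(candidates: List[str]) -> Callable[[Optional[Feedback]], str]:
    """Hypothetical stand-in for the LLM: replays fixed candidates in order.
    A real generator would use the feedback to build its next prompt."""
    remaining = iter(candidates)

    def propose(feedback: Optional[Feedback]) -> str:
        return next(remaining)

    return propose


if __name__ == "__main__":
    vulnerable = ["strcpy(buf, input);", "strcpy(dst, argv[1]);"]
    safe = ["strncpy(buf, input, sizeof buf);"]
    # First candidate is too narrow, second too broad, third fits both sets.
    proposer = make_stub_proposer([r"strcpy\(buf", r"cpy\(", r"\bstrcpy\("])
    print(refine_loop(proposer, vulnerable, safe))
```

The point of the sketch is the shape of the loop: each round returns structured feedback (what was missed, what was wrongly flagged) rather than a bare pass or fail, which is what lets the generator correct a pattern instead of guessing again.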
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Vulnerability Detection | MoCQ (test) | True Positives (TP) | 292 | 39 |
| Vulnerability Detection | C/C++ 138 vulnerabilities (test) | True Positives (TP) | 108 | 3 |
| Vulnerability Detection | All Lang. 343 vulnerabilities (test) | True Positives (TP) | 269 | 2 |
| Vulnerability Detection | C/C++ UAF 5 vulnerabilities (test) | True Positives (TP) | 4 | 2 |
| Vulnerability Detection | PHP type 6 vulnerabilities (test) | True Positives (TP) | 4 | 2 |
| Vulnerability Detection | JS proto 11 vulnerabilities (test) | True Positives (TP) | 10 | 2 |