Large Language Model-Powered Smart Contract Vulnerability Detection: New Perspectives

About

This paper provides a systematic analysis of the opportunities, challenges, and potential solutions of harnessing Large Language Models (LLMs) such as GPT-4 to uncover vulnerabilities in smart contracts, based on our ongoing research. For smart contract vulnerability detection, practical usability hinges on identifying as many true vulnerabilities as possible while minimizing the number of false positives. Our empirical study, however, reveals a tension between these two aims: generating more answers with higher randomness substantially raises the likelihood of producing a correct answer, but it also inevitably produces more false positives. To mitigate this tension, we propose an adversarial framework dubbed GPTLens that breaks the conventional one-stage detection into two synergistic stages, generation and discrimination, for progressive detection and refinement, wherein the LLM plays two roles: auditor and critic, respectively. The goal of the auditor is to yield a broad spectrum of candidate vulnerabilities in the hope of encompassing the correct answer, whereas the goal of the critic, which evaluates the validity of the identified vulnerabilities, is to minimize the number of false positives. Experimental results and illustrative examples demonstrate that the auditor and critic work together to yield pronounced improvements over the conventional one-stage detection. GPTLens is intuitive, strategic, and entirely LLM-driven, without relying on specialist expertise in smart contracts, showcasing its methodological generality and its potential to detect a broad spectrum of vulnerabilities. Our code is available at: https://github.com/git-disl/GPTLens.
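The two-stage flow described in the abstract can be illustrated with a minimal sketch. This is not the GPTLens implementation (see the repository linked above for that); every name here (`Finding`, `detect`, `auditor_a`, `auditor_b`, `toy_critic`) is hypothetical, and the auditors and critic are hard-coded stand-ins for LLM calls so that the example runs without any API access. It only shows the shape of the idea: several auditors generate a broad set of candidates, and a critic scores each one so that likely false positives rank low.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Finding:
    """One candidate vulnerability proposed by an auditor (hypothetical schema)."""
    function_name: str
    vulnerability: str
    reasoning: str
    score: float = 0.0  # filled in by the critic


# Assumed interfaces: in a real system both would wrap LLM prompts.
Auditor = Callable[[str], List[Finding]]
Critic = Callable[[str, Finding], float]


def detect(source: str, auditors: List[Auditor], critic: Critic,
           top_k: int = 1) -> List[Finding]:
    """Generation stage, then discrimination stage, then keep the top_k findings."""
    # Stage 1 (generation): pool candidates from every auditor run.
    candidates: List[Finding] = []
    for audit in auditors:
        candidates.extend(audit(source))
    # Stage 2 (discrimination): the critic scores each candidate.
    for finding in candidates:
        finding.score = critic(source, finding)
    # Highest-scored candidates first; ties keep their generation order.
    return sorted(candidates, key=lambda f: f.score, reverse=True)[:top_k]


# A toy contract fragment: the external call happens before the balance reset.
CONTRACT = '''
function withdraw() public {
    (bool ok, ) = msg.sender.call{value: balances[msg.sender]}("");
    require(ok);
    balances[msg.sender] = 0;
}
'''


def auditor_a(source: str) -> List[Finding]:
    # Stand-in for one sampled LLM answer.
    return [Finding("withdraw", "reentrancy",
                    "external call is made before the balance is reset")]


def auditor_b(source: str) -> List[Finding]:
    # Stand-in for a second, more speculative LLM answer.
    return [Finding("withdraw", "integer overflow", "balance arithmetic"),
            Finding("withdraw", "timestamp dependency", "uses block time")]


def toy_critic(source: str, finding: Finding) -> float:
    # Stand-in for an LLM critic: a crude keyword check, for illustration only.
    if finding.vulnerability == "reentrancy" and ".call{" in source:
        return 9.0
    return 2.0


if __name__ == "__main__":
    for f in detect(CONTRACT, [auditor_a, auditor_b], toy_critic, top_k=2):
        print(f.score, f.vulnerability, "in", f.function_name)
```

Keeping the auditor and critic behind plain callables is a design choice of this sketch: it lets the number of auditor runs grow (more candidates, more chances of covering the true vulnerability) while the critic stays the single place where false positives are filtered.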

Sihao Hu, Tiansheng Huang, Fatih İlhan, Selim Furkan Tekin, Ling Liu • 2023

Related benchmarks

Task	Dataset	Result
Vulnerability Detection	PrimeVul (test)	F1 Score 57.63	38
Vulnerability Detection	PrimeVul Paired (test)	Pair-Correct Count 44	22
Bad Practice Detection	Smart Contract Bad Practices SWC-107	Accuracy 71	8
Bad Practice Detection	Smart Contract Bad Practices SWC-101	Accuracy 56.25	8
Bad Practice Detection	Smart Contract Bad Practices SWC-104	Accuracy 69	7
Access Control (AC) Vulnerability Detection	I1 Baseline	Precision 20.9	7
Reentrancy Detection	I1 Baseline	Precision 75.5	7
Timestamp Dependency Detection	I1 Baseline	Precision 66.2	7
Bad Practice Detection	Smart Contract Bad Practices SWC-116	Accuracy 72.75	7
Arithmetic Errors (AE) Vulnerability Detection	I1 Baseline	Precision 62.5	6

Showing 10 of 17 rows
