Faster-GCG: Efficient Discrete Optimization Jailbreak Attacks against Aligned Large Language Models

About

Aligned Large Language Models (LLMs) have attracted significant attention for their safety, particularly in the context of jailbreak attacks that attempt to bypass guardrails via adversarial prompts. Among existing approaches, the Greedy Coordinate Gradient (GCG) attack pioneered automated jailbreaks through discrete token optimization; however, its low sample efficiency limits its practical applicability. In particular, GCG requires approximately 256K evaluations per harmful behavior to achieve a satisfactory jailbreak success rate, owing to the inherent difficulty of the underlying discrete optimization problem. In this work, we identify three key factors that limit the sample efficiency of GCG: inaccurate gradient-based estimation, inefficient uniform sampling, and repeated evaluation of previously explored suffixes. To address these issues, we propose Faster-GCG, a streamlined variant of GCG that incorporates distance-based regularization for improved estimation, temperature-controlled sampling for more effective exploration, and a visited-suffix marking mechanism to avoid redundant evaluations. Faster-GCG reduces the required evaluations to 32K, achieving up to an $8\times$ improvement in sample efficiency and a $7\times$ reduction in wall-clock time compared to GCG. Under this reduced budget, Faster-GCG attains an average jailbreak success rate of 78.1% across five aligned LLMs and achieves 88.7% against Qwen3.5-4B, outperforming state-of-the-art white-box jailbreak methods.

Xiao Li, Wei Zhang, Zhuhong Li, Qiongxiu Li, Shei PernChua, BingZe Lee, Jinghao Cui, Yifan Huang, Xiaolin Hu • 2024
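
Two of the ideas named in the abstract, temperature-controlled sampling and visited-suffix marking, are generic search techniques that can be illustrated on toy data. The sketch below is not the authors' implementation and involves no language model: the function names, the integer "tokens", and the per-token scores are all hypothetical, chosen only to show how a temperature reshapes a sampling distribution compared with uniform sampling, and how a set of visited candidates prevents the same candidate from being evaluated twice.

```python
import math
import random


def softmax_with_temperature(scores, temperature):
    """Convert scores (lower is better) into sampling probabilities.

    A small temperature concentrates probability on the best-scoring
    entries; a large temperature approaches uniform sampling.
    """
    scaled = [-s / temperature for s in scores]
    shift = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - shift) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def propose_unvisited(suffix, position, scores, temperature, visited, rng,
                      max_tries=50):
    """Sample a replacement token at `position`, skipping visited suffixes.

    `suffix` is a tuple of integer token ids (hypothetical toy data).
    `visited` is a set of suffix tuples that were already evaluated.
    Returns the new suffix, or None if no unvisited one was found.
    """
    probs = softmax_with_temperature(scores, temperature)
    vocab = list(range(len(scores)))
    for _ in range(max_tries):
        token = rng.choices(vocab, weights=probs, k=1)[0]
        candidate = suffix[:position] + (token,) + suffix[position + 1:]
        if candidate not in visited:
            visited.add(candidate)  # mark so it is never proposed again
            return candidate
    return None


if __name__ == "__main__":
    rng = random.Random(0)
    start = (0, 0, 0)
    visited = {start}
    toy_scores = [0.9, 0.1, 0.5, 0.7]  # made-up scores, one per token id
    print(propose_unvisited(start, 1, toy_scores, 0.5, visited, rng))
```

The visited set trades memory for evaluations: each candidate is checked with one hash lookup, which is cheap next to scoring a candidate. How the paper combines these pieces with its distance-based regularization is not described in the abstract, so that part is left out here.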

Related benchmarks

Task	Dataset	Result
Token-forcing loss optimization	Random targets Held-out (val)	Qwen-2.5-7B Loss 2.24	56
Jailbreak Attack	Llama 7b 2	ASR 34.2	17
Jailbreak Attack	AdvBench	Loss 0.16	16
Jailbreak Attack	JBB Qwen3-4B	Loss 0.149	13
Jailbreak Attack	JBB	Llama2-7B ASR 91.7	12
Jailbreak Attack	Jailbreak Evaluation Average across models	ASR 78.1	10
Jailbreak Attack	JBB Llama2-7B	Loss 0.106	8
Jailbreak Attack	JBB Gemma3-4B	Loss 0.348	8
Jailbreak Attack	JBB Llama3.1-8B	Loss 0.466	7
Transfer Jailbreak Attack	JBB Target: Gemini-3-flash	ASR 5.6	2

