Instruction Tuning for Secure Code Generation

About

Modern language models (LMs) have gained widespread acceptance in everyday and professional contexts, particularly in programming. An essential procedure enabling this adoption is instruction tuning, which substantially enhances LMs' practical utility by training them to follow user instructions and human preferences. However, existing instruction tuning schemes overlook a crucial aspect: the security of generated code. As a result, even state-of-the-art instruction-tuned LMs frequently produce unsafe code, posing significant security risks. In this work, we introduce SafeCoder to address this gap. SafeCoder performs security-centric fine-tuning on a diverse, high-quality dataset that we collected with an automated pipeline. We integrate the security fine-tuning with standard instruction tuning to jointly optimize security and utility. Despite its simplicity, we show that SafeCoder is effective across a variety of popular LMs and datasets. It drastically improves security (by about 30%) while preserving utility.

Jingxuan He, Mark Vero, Gabriela Krasnopolska, Martin Vechev • 2024

Related benchmarks

Task	Dataset	Metric	Result
Code Generation	HumanEval+	Pass@1	62.07	393
Code Generation	MBPP+	Pass@1	54.18	238
Secure Code Generation	CWEval	pass@1	25.49	29
Secure Code Generation	CWEval	Functionality	41.31	22
Secure Code Generation	CodeGuard+	Pass@1	65.62	18
Secure Code Generation	CyberSecEval SCG	Safety	79.06	17
Secure Code Generation	Secure Code Average	Safety Score	50.04	12
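
Several rows above report Pass@1. As a reading aid, here is a minimal sketch of the widely used unbiased pass@k estimator, 1 - C(n-c, k) / C(n, k), where n is the number of samples generated per problem and c is the number that pass the tests. Whether the listed benchmarks compute their scores exactly this way is an assumption; the function names and the (n, c) input layout are hypothetical and chosen only for illustration.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k for one problem from n samples, c of which are correct.

    Illustrative helper (not taken from the paper or the benchmark page).
    """
    if not 0 <= c <= n:
        raise ValueError("c must satisfy 0 <= c <= n")
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    # Fewer than k incorrect samples: every draw of k contains a correct one.
    if n - c < k:
        return 1.0
    # Probability that a random draw of k samples contains no correct sample,
    # subtracted from 1.
    return 1.0 - comb(n - c, k) / comb(n, k)


def benchmark_pass_at_k(results: list[tuple[int, int]], k: int) -> float:
    """Average pass@k over problems, as a percentage.

    `results` holds one hypothetical (n, c) pair per problem.
    """
    if not results:
        raise ValueError("results must not be empty")
    return 100.0 * sum(pass_at_k(n, c, k) for n, c in results) / len(results)
```

For k = 1 the estimator reduces to c / n, so a benchmark-level Pass@1 is the mean fraction of passing samples per problem, expressed as a percentage.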

