A Systematic Evaluation of Parameter-Efficient Fine-Tuning Methods for the Security of Code LLMs

Authors: Kiho Lee, Jungkon Kim, Doowon Kim, Hyoungshick Kim | Published: 2025-09-16

2025.09.16 | 2025.09.18

Source: https://arxiv.org/abs/2509.12649

PDF: https://arxiv.org/pdf/2509.12649

Labels Predicted by AI

Backdoor Detection

Please note that these labels were automatically added by AI. Therefore, they may not be entirely accurate.
For more details, please see the About the Literature Database page.

Abstract

Code-generating Large Language Models (LLMs) significantly accelerate software development. However, their frequent generation of insecure code presents serious risks. We present a comprehensive evaluation of seven parameter-efficient fine-tuning (PEFT) techniques, demonstrating substantial gains in secure code generation without compromising functionality. Our research identifies prompt-tuning as the most effective PEFT method, achieving an 80.86% secure rate compared with the 67.28% baseline. Tuning the temperature further elevated security to 87.65%, which corresponds to a reduction of approximately 203,700 vulnerable code snippets per million generated. Moreover, prompt and prefix tuning increase robustness against poisoning attacks in our TrojanPuzzle evaluation, with strong performance against CWE-79 and CWE-502 attack vectors. Our findings generalize across Python and Java, confirming prompt-tuning’s consistent effectiveness. This study provides essential insights and practical guidance for building more resilient software systems with LLMs.
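The per-million figure in the abstract can be checked with simple arithmetic. The sketch below assumes that 67.28, 80.86, and 87.65 are all secure-generation rates in percent, with 67.28 as the baseline; the function name and variable names are illustrative and do not come from the paper.

```python
def vulnerable_reduction_per_million(baseline_pct: float, improved_pct: float) -> int:
    """Fewer vulnerable snippets per million generations when the secure
    rate rises from baseline_pct to improved_pct (both in percent)."""
    # One percentage point of secure rate equals 10,000 snippets per million.
    return round((improved_pct - baseline_pct) * 10_000)


# Assumed reading of the abstract's figures (percent of secure generations).
baseline = 67.28           # assumed baseline secure rate
prompt_tuned = 80.86       # prompt-tuning
temperature_tuned = 87.65  # prompt-tuning with adjusted temperature

print(vulnerable_reduction_per_million(baseline, temperature_tuned))  # 203700
print(vulnerable_reduction_per_million(baseline, prompt_tuned))       # 135800
```

Under this reading, the 20.37-point gap between 67.28% and 87.65% reproduces the stated figure of approximately 203,700 fewer vulnerable snippets per million.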