Ocassionally Secure: A Comparative Analysis of Code Generation Assistants

arXiv

AI Security Portal bot

Information in the literature database is collected automatically.

Source

https://arxiv.org/abs/2402.00689

PDF

https://arxiv.org/pdf/2402.00689

Bibliographic Information

Authors: Ran Elgedawy; John Sadik; Senjuti Dutta; Anuj Gautam; Konstantinos Georgiou; Farzin Gholamrezae; Fujiao Ji; Kyungchan Lim; Qian Liu; Scott Ruoti
Published: 2024-02-02
Affiliation: University of Tennessee, Knoxville
Country: United States of America
Venue: Computing Research Repository (CoRR)

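AI-Estimated Labels

Code Generation; Prompt Injection; LLM Performance Evaluation

Note: These labels were added automatically by AI and may therefore be inaccurate.
For details, see "About the Literature Database".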

Abstract

Large Language Models (LLMs) are being increasingly utilized in various applications, with code generations being a notable example. While previous research has shown that LLMs have the capability to generate both secure and insecure code, the literature does not take into account what factors help generate secure and effective code. Therefore in this paper we focus on identifying and understanding the conditions and contexts in which LLMs can be effectively and safely deployed in real-world scenarios to generate quality code. We conducted a comparative analysis of four advanced LLMs--GPT-3.5 and GPT-4 using ChatGPT and Bard and Gemini from Google--using 9 separate tasks to assess each model's code generation capabilities. We contextualized our study to represent the typical use cases of a real-life developer employing LLMs for everyday tasks as work. Additionally, we place an emphasis on security awareness which is represented through the use of two distinct versions of our developer persona. In total, we collected 61 code outputs and analyzed them across several aspects: functionality, security, performance, complexity, and reliability. These insights are crucial for understanding the models' capabilities and limitations, guiding future development and practical applications in the field of automated code generation.

External Datasets

SecurityEval

SecuCoGen

CodeLMSec Benchmark