Large Language Models for Secure Code Assessment: A Multi-Language Empirical Study

RFC

Internet security glossary, version 2

R. Shirey

Published: 2007

NDSS

Firmalice - automatic detection of authentication bypass vulnerabilities in binary firmware

Y. Shoshitaishvili, R. Wang, C. Hauser, C. Kruegel, G. Vigna

Published: 2015

ACM Computing Surveys (CSUR)

Dynamic malware analysis in the modern era—a state of the art survey

O. Or-Meir, N. Nissim, Y. Elovici, L. Rokach

Published: 2019

Information sheet on software assurance

Cybersecurity and Infrastructure Security Agency (CISA)

Published: 2021

The GitLab 2022 Global DevSecOps Survey

GitLab

Published: 2022

Cybersecurity

Sifu - a Cybersecurity Awareness Platform With Challenge Assessment and Intelligent Coach

T. Espinha Gasiba, U. Lechner, M. Pinto-Albuquerque

Published: 2020

IBM Systems Journal

A survey of static analysis methods for identifying security vulnerabilities in software systems

M. Pistoia, S. Chandra, S. J. Fink, E. Yahav

Published: 2007

2021 IEEE 32nd International Symposium on Software Reliability Engineering (ISSRE)

A comparative study of automatic program repair techniques for security vulnerabilities

E. Pinconschi, R. Abreu, P. Adao

Published: 2021

Annual Computer Security Applications Conference (ACSAC)

Transformer-based language models for software vulnerability detection

C. Thapa, S. I. Jang, M. E. Ahmed, S. Camtepe, J. Pieprzyk, S. Nepal

Published: 2022

Proc. CANAI

Low level source code vulnerability detection using advanced BERT language model

Mansour Alqarni, Akramul Azim

Published: 2022

arXiv

DiverseVul: A New Vulnerable Source Code Dataset for Deep Learning Based Vulnerability Detection

Yizheng Chen, Zhoujie Ding, Lamya Alowain, Xinyun Chen, David Wagner

Published: 2023.4.2

We propose and release a new vulnerable source code dataset. We curate the dataset by crawling security issue websites, extracting vulnerability-fixing commits and source codes from the corresponding projects. Our new dataset contains 18,945 vulnerable functions spanning 150 CWEs and 330,492 non-vulnerable functions extracted from 7,514 commits. Our dataset covers 295 more projects than all previous datasets combined. Combining our new dataset with previous datasets, we present an analysis of the challenges and promising research directions of using deep learning for detecting software vulnerabilities. We study 11 model architectures belonging to 4 families. Our results show that deep learning is still not ready for vulnerability detection, due to high false positive rate, low F1 score, and difficulty of detecting hard CWEs. In particular, we demonstrate an important generalization challenge for the deployment of deep learning-based models. We show that increasing the volume of training data may not further improve the performance of deep learning models for vulnerability detection, but might be useful to improve the generalization ability to unseen projects. We also identify hopeful future research directions. We demonstrate that large language models (LLMs) are a promising research direction for ML-based vulnerability detection, outperforming Graph Neural Networks (GNNs) with code-structure features in our experiments. Moreover, developing source code specific pre-training objectives is a promising research direction to improve the vulnerability detection performance.

ICSE

Why don’t software developers use static analysis tools to find bugs?

B. Johnson, Y. Song, E. Murphy-Hill, R. Bowdidge

Published: 2013

EDOC ’20

Continuous Security Testing: A Case Study on Integrating Dynamic Security Testing Tools in CI/CD Pipelines

T. Rangnau, R. V. Buijtenen, F. Fransen, F. Turkmen

Published: 2020

4th International Computer Programming Education Conference (ICPEC 2023)

I’m Sorry Dave, I’m Afraid I Can’t Fix Your Code: On ChatGPT, CyberSecurity, and Secure Coding

T. Espinha Gasiba, K. Oguzhan, I. Kessba, U. Lechner, M. Pinto-Albuquerque

Published: 2023

ICMLA

Automated vulnerability detection in source code using deep representation learning

Rebecca L. Russell, Louis Y. Kim, Lei H. Hamilton, Tomo Lazovich, Jacob Harer, Onur Ozdemir, Paul M. Ellingwood, Marc W. McConley

Published: 2018

Journal of Network and Computer Applications

The rise of software vulnerability: Taxonomy of software vulnerabilities detection and machine learning approaches

H. Hanif, M. H. N. Md Nasir, M. F. Ab Razak, A. Firdaus, N. B. Anuar

Published: 2021

IEEE Trans. Software Eng.

Automatic feature learning for predicting vulnerable software components

Hoa Khanh Dam, Truyen Tran, Trang Pham, Shien Wee Ng, John Grundy, Aditya Ghose

Published: 2021

Advances in Neural Information Processing Systems

Attention is all you need

A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, Ł. Kaiser, I. Polosukhin

Published: 2017

Proceedings of the 2024 ACM/IEEE 44th International Conference on Software Engineering: New Ideas and Emerging Results

Large language model for vulnerability detection: Emerging results and future directions

Xin Zhou, Ting Zhang, David Lo

Published: 2024

ESEC/FSE 2023

Comparison and evaluation on static application security testing (SAST) tools for Java

K. Li, S. Chen, L. Fan, R. Feng, H. Liu, C. Liu, Y. Liu, Y. Chen

Published: 2023

Proceedings of the 17th International Conference on Mining Software Repositories

A C/C++ code vulnerability dataset with code changes and CVE summaries

Jiahao Fan, Yi Li, Shaohua Wang, Tien N Nguyen

Published: 2020

arXiv

Annual ACM Conference on Computer and Communications Security (CCS)

Large Language Models for Code: Security Hardening and Adversarial Testing

Jingxuan He, Martin Vechev

Published: 2023.2.11

Large language models (large LMs) are increasingly trained on massive codebases and used to generate code. However, LMs lack awareness of security and are found to frequently produce unsafe code. This work studies the security of LMs along two important axes: (i) security hardening, which aims to enhance LMs' reliability in generating secure code, and (ii) adversarial testing, which seeks to evaluate LMs' security at an adversarial standpoint. We address both of these by formulating a new security task called controlled code generation. The task is parametric and takes as input a binary property to guide the LM to generate secure or unsafe code, while preserving the LM's capability of generating functionally correct code. We propose a novel learning-based approach called SVEN to solve this task. SVEN leverages property-specific continuous vectors to guide program generation towards the given property, without modifying the LM's weights. Our training procedure optimizes these continuous vectors by enforcing specialized loss terms on different regions of code, using a high-quality dataset carefully curated by us. Our extensive evaluation shows that SVEN is highly effective in achieving strong security control. For instance, a state-of-the-art CodeGen LM with 2.7B parameters generates secure code for 59.1% of the time. When we employ SVEN to perform security hardening (or adversarial testing) on this LM, the ratio is significantly boosted to 92.3% (or degraded to 36.8%). Importantly, SVEN closely matches the original LMs in functional correctness.

arXiv

Conference on Neural Information Processing Systems (NeurIPS)

Devign: Effective Vulnerability Identification by Learning Comprehensive Program Semantics via Graph Neural Networks

Yaqin Zhou, Shangqing Liu, Jingkai Siow, Xiaoning Du, Yang Liu

Published: 2019.9.9

Vulnerability identification is crucial to protect the software systems from attacks for cyber security. It is especially important to localize the vulnerable functions among the source code to facilitate the fix. However, it is a challenging and tedious process, and also requires specialized security expertise. Inspired by the work on manually-defined patterns of vulnerabilities from various code representation graphs and the recent advance on graph neural networks, we propose Devign, a general graph neural network based model for graph-level classification through learning on a rich set of code semantic representations. It includes a novel Conv module to efficiently extract useful features in the learned rich node representations for graph-level classification. The model is trained over manually labeled datasets built on 4 diversified large-scale open-source C projects that incorporate high complexity and variety of real source code instead of synthesis code used in previous works. The results of the extensive evaluation on the datasets demonstrate that Devign outperforms the state of the arts significantly with an average of 10.51% higher accuracy and 8.68% F1 score, increases averagely 4.66% accuracy and 6.37% F1 by the Conv module.

ICSE-SEIP

D2A: A dataset built for AI-based vulnerability detection methods using differential analysis

Yunhui Zheng, Saurabh Pujar, Burn Lewis, et al.

Published: 2021

2022 CWE Top 25 Most Dangerous Software Weaknesses

The MITRE Corporation (MITRE)

Vulnerability detection with code language models: How far are we?

Y. Ding, Y. Fu, O. Ibrahim, C. Sitawarin, X. Chen, B. Alomair, D. Wagner, B. Ray, Y. Chen

Published: 2024

ICSE

Data quality for software vulnerability datasets

Roland Croft, M Ali Babar, M Mehdi Kholoosi

Published: 2023

Proceedings of the 17th International Conference on Predictive Models and Data Analytics in Software Engineering (PROMISE 2021)

CVEfixes: automated collection of vulnerabilities and their fixes from open-source software

Guru Bhandari, Amara Naseer, Leon Moonen

Published: 2021

RAISE 2019

Challenging machine learning algorithms in predicting vulnerable JavaScript functions

R. Ferenc, P. Hegedus, P. Gyimesi, G. Antal, D. Bán, T. Gyimothy

Published: 2019

Code Llama: Open foundation models for code

B. R. et al.

Published: 2024

Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context

Gemini-Team

Published: 2024

Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations

Transformers: State-of-the-art natural language processing

T. W. et al.

Published: 2020

How is ChatGPT’s behavior changing over time?

L. Chen, M. Zaharia, J. Zou

Published: 2023

Common Weakness Enumeration (CWE)

MITRE

Published: 2024

The Annals of Mathematical Statistics

On a test of whether one of two random variables is stochastically larger than the other

H. B. Mann, D. R. Whitney

Published: 1947

Psychometrika

Rank-Biserial Correlation

E. E. Cureton

Published: 1956

Advances in Computers

Security Testing: A Survey

M. Felderer, M. Buchler, M. Johns, A. D. Brucker, R. Breu, A. Pretschner

Published: 2016

Empirical Software Engineering

Can traditional fault prediction models be used for vulnerability prediction?

Y. Shin, L. Williams

Published: 2013

Deep learning based vulnerability detection: Are we there yet?

S. Chakraborty, R. Krishna, Y. Ding, B. Ray

Published: 2020

VulDeePecker: A deep learning-based system for vulnerability detection

Zhen Li, Deqing Zou, Shouhuai Xu, Xinyu Ou, Hai Jin, Sujuan Wang, Zhijun Deng, Yuyi Zhong

Published: 2018

Code summarization with structure-induced transformer

H. Wu, H. Zhao, M. Zhang

Published: 2021

Evaluation of ChatGPT model for vulnerability detection

A. Cheshkov, P. Zadorozhny, R. Levichev

Published: 2023

Future Internet

A new approach to web application security: Utilizing GPT language models for source code inspection

Z. Szabo, V. Bilicki

Published: 2023

Understanding the effectiveness of large language models in detecting security vulnerabilities

Avishree Khare, Saikat Dutta, Ziyang Li, Alaia Solko-Breslin, Rajeev Alur, Mayur Naik

Published: 2023

A case study of large language models (ChatGPT and CodeBERT) for security-oriented code analysis

Z. Wang, L. Zhang, C. Cao, N. Luo, P. Liu

Published: 2024

Transformer-based vulnerability detection in code at EditTime: Zero-shot, few-shot, or fine-tuning?

Aaron Chan, Anant Kharkar, Roshanak Zilouchian Moghaddam, et al.

Published: 2023

AIBugHunter: A practical tool for predicting, classifying and repairing software vulnerabilities

M. Fu, C. Tantithamthavorn, T. Le, Y. Kume, V. Nguyen, D. Phung, J. Grundy

Published: 2023