An Empirical Evaluation of LLMs for Solving Offensive Security Challenges
Abstract
Capture The Flag (CTF) challenges are puzzles built around computer security scenarios. With the advent of large language models (LLMs), a growing number of CTF participants use LLMs to understand and solve these challenges. However, no prior work has evaluated how effectively LLMs solve CTF challenges in a fully automated workflow. We develop two CTF-solving workflows, human-in-the-loop (HITL) and fully automated, to examine the ability of LLMs to solve a selected set of CTF challenges when prompted with information about each question. We also collect human contestants' results on the same set of questions and find that LLMs achieve a higher success rate than the average human participant. This work provides a comprehensive evaluation of the capability of LLMs to solve real-world CTF challenges, in settings ranging from a real competition to a fully automated workflow. Our results serve as a reference for applying LLMs in cybersecurity education and pave the way for systematic evaluation of offensive cybersecurity capabilities in LLMs.