Oedipus: LLM-enchanced Reasoning CAPTCHA Solver | AI Security Portal

arxiv

Oedipus: LLM-enchanced Reasoning CAPTCHA Solver

AI Security Portal bot

Information in the literature database is collected automatically.

Source

https://arxiv.org/abs/2405.07496

PDF

https://arxiv.org/pdf/2405.07496

Paper Information

Author: Gelei Deng; Haoran Ou; Yi Liu; Jie Zhang; Tianwei Zhang; Yang Liu
Published: 5-13-2024
Affiliation: Nanyang Technological University
Country: Singapore
Conference: Annual ACM Conference on Computer and Communications Security (CCS)

Labels Estimated by AI

Prompt Injection, CAPTCHA Solver, LLM Performance Evaluation

These labels were automatically added by AI and may be inaccurate.
For details, see About Literature Database.

Abstract

CAPTCHAs have become a ubiquitous tool in safeguarding applications from automated bots. Over time, the arms race between CAPTCHA development and evasion techniques has led to increasingly sophisticated and diverse designs. The latest iteration, reasoning CAPTCHAs, exploits tasks that are intuitively simple for humans but challenging for conventional AI technologies, thereby enhancing security measures. Driven by the evolving AI capabilities, particularly the advancements in Large Language Models (LLMs), we investigate the potential of multimodal LLMs to solve modern reasoning CAPTCHAs. Our empirical analysis reveals that, despite their advanced reasoning capabilities, LLMs struggle to solve these CAPTCHAs effectively. In response, we introduce Oedipus, an innovative end-to-end framework for automated reasoning CAPTCHA solving. Central to this framework is a novel strategy that dissects the complex and human-easy-AI-hard tasks into a sequence of simpler and AI-easy steps. This is achieved through the development of a Domain Specific Language (DSL) for CAPTCHAs that guides LLMs in generating actionable sub-steps for each CAPTCHA challenge. The DSL is customized to ensure that each unit operation is a highly solvable subtask revealed in our previous empirical study. These sub-steps are then tackled sequentially using the Chain-of-Thought (CoT) methodology. Our evaluation shows that Oedipus effectively resolves the studied CAPTCHAs, achieving an average success rate of 63.5%. Remarkably, it also shows adaptability to the most recent CAPTCHA designs introduced in late 2023, which are not included in our initial study. This prompts a discussion on future strategies for designing reasoning CAPTCHAs that can effectively counter advanced AI solutions.
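
The decomposition idea in the abstract (one hard task split into a sequence of simple unit operations that run in order) can be illustrated with a toy sketch. This is not the Oedipus DSL, which this page does not reproduce. The operation names `FILTER` and `LARGEST`, the `Obj` scene model, and the `run_program` interpreter are all hypothetical, and each operation is computed directly here rather than by prompting an LLM.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class Obj:
    """Hypothetical scene object; a real solver would derive this from an image."""
    shape: str
    color: str
    size: int


def op_filter(objs: List[Obj], key: str, value: str) -> List[Obj]:
    # Unit operation: keep only the objects whose attribute `key` equals `value`.
    return [o for o in objs if getattr(o, key) == value]


def op_largest(objs: List[Obj]) -> List[Obj]:
    # Unit operation: keep the single largest object (empty input stays empty).
    return [max(objs, key=lambda o: o.size)] if objs else []


# Hypothetical operation table; these names are illustrative, not from the paper.
OPS: Dict[str, Callable[..., List[Obj]]] = {
    "FILTER": op_filter,
    "LARGEST": op_largest,
}


def run_program(program: List[tuple], scene: List[Obj]) -> Tuple[List[Obj], List[Tuple[str, int]]]:
    """Run the steps in order; each step consumes the previous step's output."""
    current = list(scene)
    trace = []
    for name, *args in program:
        current = OPS[name](current, *args)
        trace.append((name, len(current)))  # record how many candidates remain
    return current, trace


# Toy challenge: "click the largest red cube".
scene = [
    Obj("cube", "red", 3),
    Obj("sphere", "red", 5),
    Obj("cube", "blue", 7),
    Obj("cube", "red", 4),
]
program = [("FILTER", "color", "red"), ("FILTER", "shape", "cube"), ("LARGEST",)]
result, trace = run_program(program, scene)
print(result)
print(trace)
```

In this sketch each step only has to answer a narrow question, and the trace records the intermediate state after every step. That loosely mirrors the sequential, step-by-step solving the abstract describes, under the assumptions stated above.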

External Datasets

Arkose-Angular

Geetest-Gobang

Geetest-Space

Yidun-Space-Reasoning

Arkose-FunCAPTCHA

Geetest-IconCrush

References

Moravec’s paradox

Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security

Yet Another Text Captcha Solver: A Generative Adversarial Network Based Approach

Ye, G., Tang, Z., Fang, D., Zhu, Z., Feng, Y., Xu, P., Chen, X., Wang, Z.

Published: 2018

8th USENIX Workshop on Offensive Technologies (WOOT 14)

The end is nigh: Generic solving of text-based CAPTCHAs

E. Bursztein, J. Aigrain, A. Moscicki, J. C. Mitchell

Published: 2014

How secure is your website? a comprehensive investigation on captcha providers and solving services

R. Jin, L. Huang, J. Duan, W. Zhao, Y. Liao, P. Zhou

Published: 2023

An object detection based solver for google’s image recaptcha v2

M. I. Hossen, Y. Tu, M. F. Rabby, M. N. Islam, H. Cao, X. Hei

Published: 2021

2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE

A captcha design based on visual reasoning

H. Wang, F. Zheng, Z. Chen, Y. Lu, J. Gao, R. Wei

Published: 2018

30th USENIX security symposium (USENIX security 21)

Research on the security of visual reasoning CAPTCHA

Y. Gao, H. Gao, S. Luo, Y. Zi, S. Zhang, W. Mao, P. Wang, Y. Shen, J. Yan

Published: 2021

IEEE Transactions on Dependable and Secure Computing

Extended research on the security of visual reasoning captcha

P. Wang, H. Gao, C. Xiao, X. Guo, Y. Gao, Y. Zi

Published: 2023

Detrs with collaborative hybrid assignments training

Z. Zong, G. Song, Y. Liu

Published: 2022

A survey of large language models

W. X. Zhao, K. Zhou, J. Li, T. Tang, X. Wang, Y. Hou, Y. Min, B. Zhang, J. Zhang, Z. Dong

Published: 2023

ACM Computing Surveys

Pre-train, prompt, and predict: A systematic survey of prompting methods in natural language processing

Pengfei Liu, Weizhe Yuan, Jinlan Fu, Zhengbao Jiang, Hiroaki Hayashi, Graham Neubig

Published: 2023

Chain-of-thought prompting elicits reasoning in large language models

J. Wei, X. Wang, D. Schuurmans, M. Bosma, B. Ichter, F. Xia, E. Chi, Q. Le, D. Zhou

Published: 2023

ACM computing surveys (CSUR)

When and how to develop domain-specific languages

M. Mernik, J. Heering, A. M. Sloane

Published: 2005

Computing Research Repository (CoRR)

Deep-CAPTCHA: a deep learning based CAPTCHA solver for vulnerability assessment

Zahra Noury, Mahdi Rezaei

Published: 6.15.2020

CAPTCHA is a human-centred test to distinguish a human operator from bots, attacking programs, or other computerised agents that tries to imitate human intelligence. In this research, we investigate a way to crack visual CAPTCHA tests by an automated deep learning based solution. The goal of this research is to investigate the weaknesses and vulnerabilities of the CAPTCHA generator systems; hence, developing more robust CAPTCHAs, without taking the risks of manual try and fail efforts. We develop a Convolutional Neural Network called Deep-CAPTCHA to achieve this goal. The proposed platform is able to investigate both numerical and alphanumerical CAPTCHAs. To train and develop an efficient model, we have generated a dataset of 500,000 CAPTCHAs to train our model. In this paper, we present our customised deep neural network model, we review the research gaps, the existing challenges, and the solutions to cope with the issues. Our network's cracking accuracy leads to a high rate of 98.94% and 98.31% for the numerical and the alpha-numerical test datasets, respectively. That means more works is required to develop robust CAPTCHAs, to be non-crackable against automated artificial agents. As the outcome of this research, we identify some efficient techniques to improve the security of the CAPTCHAs, based on the performance analysis conducted on the Deep-CAPTCHA model.

Deep Learning Method, Performance Evaluation, Vulnerability detection

Explaining and Harnessing Adversarial Examples

I. J. Goodfellow, J. Shlens, C. Szegedy

Published: 2014

Proceedings of the 18th international conference on World wide web

What’s up captcha? a captcha based on image orientation

R. Gossweiler, M. Kamvar, S. Baluja

Published: 2009

International Journal of Computer Science and Information Technologies

Survey of different types of captcha

V. P. Singh, P. Pal

Published: 2014

19th USENIX Security Symposium (USENIX Security 10)

Re: CAPTCHAs—Understanding CAPTCHA-Solving services in an economic context

M. Motoyama, K. Levchenko, C. Kanich, D. McCoy, G. M. Voelker, S. Savage

Published: 2010

2014 9th International Workshop on Semantic and Social Media Adaptation and Personalization. IEEE

Automated captcha solving: An empirical comparison of selected techniques

M. Korakakis, E. Magkos, P. Mylonas

Published: 2014

A survey on evaluation of large language models

Y. Chang, X. Wang, J. Wang, Y. Wu, K. Zhu, H. Chen, L. Yang, X. Yi, C. Wang, Y. Wang

Published: 2023

USENIX Security Symposium

PentestGPT: An LLM-empowered Automatic Penetration Testing Tool

Gelei Deng, Yi Liu, Víctor Mayoral-Vilches, Peng Liu, Yuekang Li, Yuan Xu, Tianwei Zhang, Yang Liu, Martin Pinzger, Stefan Rass

Published: 8.13.2023

Penetration testing, a crucial industrial practice for ensuring system security, has traditionally resisted automation due to the extensive expertise required by human professionals. Large Language Models (LLMs) have shown significant advancements in various domains, and their emergent abilities suggest their potential to revolutionize industries. In this research, we evaluate the performance of LLMs on real-world penetration testing tasks using a robust benchmark created from test machines with platforms. Our findings reveal that while LLMs demonstrate proficiency in specific sub-tasks within the penetration testing process, such as using testing tools, interpreting outputs, and proposing subsequent actions, they also encounter difficulties maintaining an integrated understanding of the overall testing scenario. In response to these insights, we introduce PentestGPT, an LLM-empowered automatic penetration testing tool that leverages the abundant domain knowledge inherent in LLMs. PentestGPT is meticulously designed with three self-interacting modules, each addressing individual sub-tasks of penetration testing, to mitigate the challenges related to context loss. Our evaluation shows that PentestGPT not only outperforms LLMs with a task-completion increase of 228.6% compared to the GPT-3.5 model among the benchmark targets but also proves effective in tackling real-world penetration testing challenges. Having been open-sourced on GitHub, PentestGPT has garnered over 4,700 stars and fostered active community engagement, attesting to its value and impact in both the academic and industrial spheres.

Prompt Injection, Penetration Testing Methods, Performance Evaluation

Proceedings of the 31st Annual Network and Distributed System Security Symposium (NDSS)

Large language model guided protocol fuzzing

R. Meng, M. Mirchev, M. Böhme, A. Roychoudhury

Published: 2024

Bot or human? detecting chatgpt imposters with a single question

H. Wang, X. Luo, W. Wang, X. Yan

Published: 2023

Procedia Computer Science

Recent advances of captcha security analysis: a short literature review

N. T. Dinh, V. T. Hoang

Published: 2023

32nd USENIX Security Symposium (USENIX Security 23). USENIX Association

An empirical study & evaluation of modern CAPTCHAs

A. Searles, Y. Nakatsuka, E. Ozturk, A. Paverd, G. Tsudik, A. Enkoji

Published: 2023

Archives of Computational Methods in Engineering

A systematic survey on captcha recognition: types, creation and breaking techniques

M. Kumar, M. Jindal, M. Kumar

Published: 2022

CAPTCHA Patents

Theoremqa: A theorem-driven question answering dataset

W. Chen, M. Yin, M. Ku, P. Lu, Y. Wan, X. Ma, J. Xu, X. Wang, T. Xia

Published: 2023

gpt-4-vision preview

gemini-pro vision

Attention is all you need

A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, L. Kaiser, I. Polosukhin

Published: 2017

Chatgpt is not enough: Enhancing large language models with knowledge graphs for fact-aware language modeling

L. Yang, H. Chen, Z. Li, X. Ding, X. Wu

Published: 2023

Three-valued logic

tuned code llama

Designing human friendly human interaction proofs (HIPs)

K. Chellapilla, K. Larson, P. Simard, M. Czerwinski

Published: 2005

Advances in Cryptology—EUROCRYPT 2003: International Conference on the Theory and Applications of Cryptographic Techniques, Warsaw, Poland, May 4–8, 2003 Proceedings 22. Springer

Captcha: Using hard ai problems for security

L. Von Ahn, M. Blum, N. J. Hopper, J. Langford

Published: 2003

RSS 2023 Workshop on Learning for Task and Motion Planning

Large language models as commonsense knowledge for large-scale task planning

Z. Zhao, W. S. Lee, D. Hsu

Published: 2023