Large Language Models (LLMs) such as ChatGPT and its competitors have
revolutionized natural language processing, but their capabilities also
introduce new security vulnerabilities. This survey provides a comprehensive
overview of these emerging concerns, categorizing threats into several key
areas: prompt injection and jailbreaking; adversarial attacks, including input
perturbations and data poisoning; misuse by malicious actors to generate
disinformation, phishing emails, and malware; and the worrisome risks inherent
in autonomous LLM agents. Particular attention is devoted to the latter,
exploring goal misalignment, emergent deception,
self-preservation instincts, and the potential for LLMs to develop and pursue
covert, misaligned objectives, a behavior known as scheming, which may even
persist through safety training. We summarize recent academic and industry
studies from 2022 to 2025 that exemplify each threat, analyze proposed defenses
and their limitations, and identify open challenges in securing LLM-based
applications. We conclude by emphasizing the need for robust, multi-layered
security strategies to ensure that LLMs remain safe and beneficial.