Exploiting Large Language Models (LLMs) through Deception Techniques and Persuasion Principles

arxiv

Information in the literature database is collected automatically.

Source

https://arxiv.org/abs/2311.14876

PDF

https://arxiv.org/pdf/2311.14876

Paper Information

Authors: Sonali Singh; Faranak Abri; Akbar Siami Namin
Published: 11-25-2023
Affiliation: Department of Computing Science, Texas Tech University
Country: United States of America
Conference: IEEE Big Data

Labels Estimated by AI

Prompt Injection, Psychological Manipulation, Abuse of AI Chatbots

These labels were automatically added by AI and may be inaccurate.

Abstract

As Large Language Models (LLMs) such as ChatGPT from OpenAI, BARD from Google, Llama2 from Meta, and Claude from Anthropic AI gain widespread use, ensuring their security and robustness is critical. The widespread use of these language models relies heavily on their reliability and on proper usage of the technology. It is crucial to test these models thoroughly, not only to ensure their quality but also to identify possible misuse by adversaries for illegal activities such as hacking. This paper presents a novel study of the exploitation of such large language models through deceptive interactions. More specifically, the paper borrows widely used, well-known techniques from deception theory to investigate whether these models are susceptible to deceitful interactions. This research aims not only to highlight these risks but also to pave the way for robust countermeasures that enhance the security and integrity of language models in the face of sophisticated social engineering tactics. Through systematic experiments and analysis, we assess the models' performance in these critical security domains. Our results demonstrate that these large language models are susceptible to deception and social engineering attacks.