Can LLMs be Scammed? A Baseline Measurement Study

arxiv

Source

https://arxiv.org/abs/2410.13893

PDF

https://arxiv.org/pdf/2410.13893

Paper Information

Authors: Udari Madhushani Sehwag, Kelly Patel, Francesca Mosca, Vineeth Ravi, Jessica Staddon
Published: 2024-10-14
Affiliation: JPMorgan AI Research
Country: United States of America
Conference: Computing Research Repository (CoRR)

Labels Estimated by AI

LLM Performance Evaluation, Prompt Injection, Evaluation Method

These labels were automatically added by AI and may be inaccurate.

Abstract

Despite the importance of developing generative AI models that can effectively resist scams, current literature lacks a structured framework for evaluating their vulnerability to such threats. In this work, we address this gap by constructing a benchmark based on the FINRA taxonomy and systematically assessing Large Language Models' (LLMs') vulnerability to a variety of scam tactics. First, we incorporate 37 well-defined base scam scenarios reflecting the diverse scam categories identified by the FINRA taxonomy, providing a focused evaluation of LLMs' scam detection capabilities. Second, we utilize representative proprietary (GPT-3.5, GPT-4) and open-source (Llama) models to analyze their performance in scam detection. Third, our research provides critical insights into which scam tactics are most effective against LLMs and how varying persona traits and persuasive techniques influence these vulnerabilities. We reveal distinct susceptibility patterns across different models and scenarios, underscoring the need for targeted enhancements in LLM design and deployment.
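
The abstract describes an evaluation that varies base scam scenarios, persona traits, and persuasive techniques. The sketch below shows one way such a grid could be assembled and scored. It is an illustration only, not the authors' code: the names `EvalCase`, `build_cases`, and `susceptibility_rate`, the prompt wording, the YES/NO answer format, and all sample scenarios, personas, and techniques are hypothetical placeholders that do not come from the paper.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class EvalCase:
    """One evaluation prompt (hypothetical structure, not from the paper)."""
    scenario_id: str
    persona: str
    technique: str
    prompt: str


def build_cases(scenarios, personas, techniques):
    """Cross every base scenario with every persona and persuasion technique."""
    cases = []
    for (sid, text), persona, technique in product(
        scenarios.items(), personas, techniques
    ):
        # Assumed prompt template; the paper's actual prompts are not shown here.
        prompt = (
            f"You are {persona}. You receive the following message:\n"
            f"{text}\n"
            f"(Persuasion style: {technique})\n"
            "Is this a scam? Answer YES or NO."
        )
        cases.append(EvalCase(sid, persona, technique, prompt))
    return cases


def susceptibility_rate(answers):
    """Fraction of scam prompts the model did not flag (any answer other than YES)."""
    if not answers:
        raise ValueError("no responses to score")
    missed = sum(1 for a in answers if a.strip().upper() != "YES")
    return missed / len(answers)


# Placeholder inputs for illustration; the real benchmark uses 37 base scenarios.
sample_scenarios = {
    "s1": "placeholder scam message A",
    "s2": "placeholder scam message B",
}
sample_personas = ["a retiree", "a student"]
sample_techniques = ["urgency", "authority"]

cases = build_cases(sample_scenarios, sample_personas, sample_techniques)
```

Each prompt in `cases` would be sent to a model, and the collected answers passed to `susceptibility_rate`, grouped by scenario, persona, or technique to compare patterns across models.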

External Datasets

37 baseline scam scenarios based on the FINRA taxonomy