In ChatGPT We Trust? Measuring and Characterizing the Reliability of ChatGPT

Source

https://arxiv.org/abs/2304.08979

PDF

https://arxiv.org/pdf/2304.08979

Paper Information

Authors: Xinyue Shen; Zeyuan Chen; Michael Backes; Yang Zhang
Published: 2023-04-18
Updated: 2023-10-05
Affiliation: CISPA Helmholtz Center for Information Security
Country: Germany

Labels Estimated by AI

Prompt Injection LLM Security User Experience Evaluation

These labels were automatically added by AI and may be inaccurate.

Abstract

The way users acquire information is undergoing a paradigm shift with the advent of ChatGPT. Unlike conventional search engines, ChatGPT retrieves knowledge from the model itself and generates answers for users. ChatGPT's impressive question-answering (QA) capability has attracted more than 100 million users within a short period of time but has also raised concerns regarding its reliability. In this paper, we perform the first large-scale measurement of ChatGPT's reliability in the generic QA scenario with a carefully curated set of 5,695 questions across ten datasets and eight domains. We find that ChatGPT's reliability varies across different domains, especially underperforming in law and science questions. We also demonstrate that system roles, originally designed by OpenAI to allow users to steer ChatGPT's behavior, can impact ChatGPT's reliability in an imperceptible way. We further show that ChatGPT is vulnerable to adversarial examples, and even a single character change can negatively affect its reliability in certain cases. We believe that our study provides valuable insights into ChatGPT's reliability and underscores the need for strengthening the reliability and security of large language models (LLMs).

External Datasets

BoolQ

OpenbookQA

RACE

ARC

CommonsenseQA

SQuAD1

SQuAD2

NarrativeQA

ELI5

TruthfulQA