Abstract
The potential misuse of ChatGPT and other Large Language Models (LLMs) has
raised concerns regarding the dissemination of false information, plagiarism,
academic dishonesty, and fraudulent activities. Consequently, distinguishing
between AI-generated and human-generated content has emerged as an intriguing
research topic. However, current text detection methods lack precision and are
often restricted to specific tasks or domains, making them inadequate for
identifying content generated by ChatGPT. In this paper, we propose an
effective ChatGPT detector named DEMASQ, which accurately identifies
ChatGPT-generated content. Our method addresses two critical factors: (i) the
distinct biases in text composition observed in human- and machine-generated
content and (ii) the alterations made by humans to evade previous detection
methods. DEMASQ is an energy-based detection model that incorporates novel
aspects, such as (i) optimization inspired by the Doppler effect to capture the
interdependence between input text embeddings and output labels, and (ii) the
use of explainable AI techniques to generate diverse perturbations. To evaluate
our detector, we create a benchmark dataset comprising a mixture of prompts
from both ChatGPT and humans, encompassing domains such as medical, open Q&A,
finance, wiki, and Reddit. Our evaluation demonstrates that DEMASQ achieves
high accuracy in identifying content generated by ChatGPT.