Chimera: Harnessing Multi-Agent LLMs for Automatic Insider Threat Simulation

Source

https://arxiv.org/abs/2508.07745

PDF

https://arxiv.org/pdf/2508.07745

Paper Information

Authors: Jiongchi Yu, Xiaofei Xie, Qiang Hu, Yuhan Ma, Ziming Zhao
Published: 2025-08-11
Updated: 2025-08-12
Affiliation: Singapore Management University
Country: Singapore
Conference: Computing Research Repository (CoRR)

Labels Estimated by AI

Data Collection, Indirect Prompt Injection, User Behavior Analysis

These labels were automatically added by AI and may be inaccurate.

Abstract

Insider threats, which can lead to severe losses, remain a major security concern. While machine learning-based insider threat detection (ITD) methods have shown promising results, their progress is hindered by the scarcity of high-quality data. Enterprise data is sensitive and rarely accessible. Publicly available datasets fall short in two ways: those limited in scale due to cost lack sufficient real-world coverage, and those that are purely synthetic fail to capture rich semantics and realistic user behavior. To address this, we propose Chimera, the first large language model (LLM)-based multi-agent framework that automatically simulates both benign and malicious insider activities and collects diverse logs across a range of enterprise environments. Chimera models each employee with agents that have role-specific behavior and integrates modules for group meetings, pairwise interactions, and autonomous scheduling, capturing realistic organizational dynamics. It incorporates 15 types of insider attacks (e.g., IP theft, system sabotage) and has been deployed to simulate activities in three sensitive domains: a technology company, a finance corporation, and a medical institution, producing a new dataset, ChimeraLog. We assess ChimeraLog via human studies and quantitative analysis, confirming its diversity, realism, and the presence of explainable threat patterns. Evaluations of existing ITD methods show an average F1-score of 0.83 on ChimeraLog, which is significantly lower than the 0.99 obtained on the CERT dataset, demonstrating ChimeraLog's higher difficulty and utility for advancing ITD research.
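
For readers unfamiliar with the metric quoted in the abstract, the following is a minimal sketch of how an F1-score is computed from detection counts and then averaged across several detectors. The detector names and confusion counts are hypothetical placeholders chosen for illustration; they are not taken from the paper or from ChimeraLog, and the paper's own evaluation protocol may differ.

```python
from statistics import mean


def f1_score(tp: int, fp: int, fn: int) -> float:
    """Return F1 = 2*TP / (2*TP + FP + FN).

    Returns 0.0 when there are no true positives, false positives,
    or false negatives at all, to avoid dividing by zero.
    """
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Hypothetical (TP, FP, FN) counts for three detectors.
# Illustrative values only, not results from the paper.
counts = {
    "detector_a": (80, 10, 10),
    "detector_b": (60, 20, 20),
    "detector_c": (90, 5, 5),
}

# Per-detector F1, then the unweighted average across detectors.
per_method = {name: f1_score(*c) for name, c in counts.items()}
average_f1 = mean(per_method.values())

if __name__ == "__main__":
    for name, score in per_method.items():
        print(f"{name}: F1 = {score:.3f}")
    print(f"average F1 = {average_f1:.3f}")
```

Because F1 is the harmonic mean of precision and recall, a detector scores well only when it both catches most malicious activity and raises few false alarms, which is why a drop in average F1 is read as a sign of a harder dataset.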

External Datasets

ChimeraLog

CERT