Log-based insider threat detection (ITD) identifies malicious user activities by
auditing log entries. Recently, large language models (LLMs), with their strong
commonsense knowledge, have been introduced to the domain of ITD. Nevertheless,
diverse activity types and overly long log files make it difficult for LLMs to
directly discern malicious activities among myriads of normal ones.
Furthermore, the faithfulness hallucination issue of LLMs further complicates
their application to ITD, as the generated conclusions may not align with
user commands and activity context. In response to these challenges, we
introduce Audit-LLM, a multi-agent log-based insider threat detection framework
comprising three collaborative agents: (i) the Decomposer agent, which breaks down
the complex ITD task into manageable sub-tasks using Chain-of-Thought (CoT)
reasoning; (ii) the Tool Builder agent, which creates reusable tools for these
sub-tasks to overcome the context-length limitations of LLMs; and (iii) the
Executor agent, which generates the final detection conclusion by invoking the
constructed tools. To enhance the accuracy of this conclusion, we propose a
pair-wise Evidence-based Multi-agent Debate (EMAD) mechanism, in which two
independent Executors iteratively refine their conclusions through reasoning
exchange until they reach a consensus.
Comprehensive experiments conducted on three publicly available ITD
datasets (CERT r4.2, CERT r5.2, and PicoDomain) demonstrate the superiority of
our method over existing baselines and show that the proposed EMAD
significantly improves the faithfulness of explanations generated by LLMs.
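
To make the described workflow concrete, the following is a minimal, illustrative sketch of the Decomposer → Tool Builder → Executor pipeline with a pair-wise EMAD debate. All function names, prompts, and the `llm()` call are hypothetical stand-ins introduced for illustration, not the authors' implementation; the sketch only assumes access to some underlying LLM completion function.

```python
from typing import Callable

def llm(prompt: str) -> str:
    """Placeholder for a call to whatever LLM backend is available."""
    raise NotImplementedError

def decomposer(task: str) -> list[str]:
    # Decomposer agent: break the ITD task into manageable sub-tasks
    # via Chain-of-Thought style prompting.
    plan = llm(f"Decompose this insider threat detection task step by step:\n{task}")
    return [step for step in plan.splitlines() if step.strip()]

def tool_builder(sub_task: str) -> Callable[[list[str]], list[str]]:
    # Tool Builder agent: create a reusable tool (here, a simple log filter)
    # so the Executor never has to fit an over-long log file into one context window.
    keyword = llm(f"Name the log keyword most relevant to this sub-task: {sub_task}").strip()
    def tool(log_lines: list[str]) -> list[str]:
        return [line for line in log_lines if keyword.lower() in line.lower()]
    return tool

def executor(sub_tasks: list[str], tools, log_lines: list[str]) -> str:
    # Executor agent: invoke the constructed tools to gather evidence,
    # then draft a detection conclusion grounded in that evidence.
    evidence = []
    for sub_task, tool in zip(sub_tasks, tools):
        evidence += tool(log_lines)
    return llm("Based only on the evidence below, decide whether the user is malicious "
               "and cite the supporting entries:\n" + "\n".join(evidence))

def audit_llm(task: str, log_lines: list[str], rounds: int = 2) -> str:
    # Pair-wise EMAD: two independent Executors exchange their reasoning
    # and refine their conclusions until they agree or the rounds run out.
    sub_tasks = decomposer(task)
    tools = [tool_builder(st) for st in sub_tasks]
    a = executor(sub_tasks, tools, log_lines)
    b = executor(sub_tasks, tools, log_lines)
    for _ in range(rounds):
        if a == b:  # crude consensus check, for illustration only
            break
        a = llm(f"Your conclusion:\n{a}\nPeer's conclusion:\n{b}\nRevise, citing evidence.")
        b = llm(f"Your conclusion:\n{b}\nPeer's conclusion:\n{a}\nRevise, citing evidence.")
    return a
```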