Log-based insider threat detection (ITD) identifies malicious user activities by
auditing log entries. Recently, large language models (LLMs), with their strong
commonsense knowledge, have been introduced to the domain of ITD. Nevertheless,
diverse activity types and overly long log files make it difficult for LLMs to
directly discern malicious activities among myriads of normal ones.
Furthermore, the faithfulness hallucination issue of LLMs further complicates
their application to ITD, as the generated conclusions may not align with
user commands and activity context. In response to these challenges, we
introduce Audit-LLM, a multi-agent log-based insider threat detection framework
comprising three collaborative agents: (i) the Decomposer agent, which breaks down
the complex ITD task into manageable sub-tasks using Chain-of-Thought (CoT)
reasoning; (ii) the Tool Builder agent, which creates reusable tools for these
sub-tasks to overcome the context-length limitations of LLMs; and (iii) the
Executor agent, which generates the final detection conclusion by invoking the
constructed tools. To enhance the accuracy of this conclusion, we propose a
pair-wise Evidence-based Multi-agent Debate (EMAD) mechanism, in which two
independent Executors iteratively refine their conclusions through reasoning
exchange until they reach a consensus.
Comprehensive experiments conducted on three publicly available ITD
datasets (CERT r4.2, CERT r5.2, and PicoDomain) demonstrate the superiority of
our method over existing baselines and show that the proposed EMAD
significantly improves the faithfulness of explanations generated by LLMs.
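
To make the described workflow concrete, the following is a minimal, illustrative sketch of the Decomposer → Tool Builder → Executor pipeline with a pair-wise EMAD debate. All function names, prompts, and the `llm()` call are hypothetical stand-ins introduced for illustration, not the authors' implementation; the sketch only assumes access to some underlying LLM completion function.

```python
from typing import Callable

def llm(prompt: str) -> str:
    """Placeholder for a call to whatever LLM backend is available."""
    raise NotImplementedError

def decomposer(task: str) -> list[str]:
    # Decomposer agent: break the ITD task into manageable sub-tasks
    # via Chain-of-Thought style prompting.
    plan = llm(f"Decompose this insider threat detection task step by step:\n{task}")
    return [step for step in plan.splitlines() if step.strip()]

def tool_builder(sub_task: str) -> Callable[[list[str]], list[str]]:
    # Tool Builder agent: create a reusable tool (here, a simple log filter)
    # so the Executor never has to fit an over-long log file into one context window.
    keyword = llm(f"Name the log keyword most relevant to this sub-task: {sub_task}").strip()
    def tool(log_lines: list[str]) -> list[str]:
        return [line for line in log_lines if keyword.lower() in line.lower()]
    return tool

def executor(sub_tasks: list[str], tools, log_lines: list[str]) -> str:
    # Executor agent: invoke the constructed tools to gather evidence,
    # then draft a detection conclusion grounded in that evidence.
    evidence = []
    for sub_task, tool in zip(sub_tasks, tools):
        evidence += tool(log_lines)
    return llm("Based only on the evidence below, decide whether the user is malicious "
               "and cite the supporting entries:\n" + "\n".join(evidence))

def audit_llm(task: str, log_lines: list[str], rounds: int = 2) -> str:
    # Pair-wise EMAD: two independent Executors exchange their reasoning
    # and refine their conclusions until they agree or the rounds run out.
    sub_tasks = decomposer(task)
    tools = [tool_builder(st) for st in sub_tasks]
    a = executor(sub_tasks, tools, log_lines)
    b = executor(sub_tasks, tools, log_lines)
    for _ in range(rounds):
        if a == b:  # crude consensus check, for illustration only
            break
        a = llm(f"Your conclusion:\n{a}\nPeer's conclusion:\n{b}\nRevise, citing evidence.")
        b = llm(f"Your conclusion:\n{b}\nPeer's conclusion:\n{a}\nRevise, citing evidence.")
    return a
```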