AI Security Portal K Program
Intrusion Detection at Scale with the Assistance of a Command-line Language Model
Abstract
Intrusion detection is a long-standing and crucial problem in security. Systems that detect intrusions automatically are in great demand in enterprise security solutions. Existing solutions rely heavily on hand-crafted rules designed by security operators, which suffer from high false negative rates and generalize poorly to new, zero-day attacks at scale. AI and machine learning offer promising ways to address these issues by inspecting abnormal user behavior intelligently and automatically from data. However, existing learning-based intrusion detection systems in the literature are mostly designed for small data and cannot leverage the big data available in cloud environments. In this paper, we target this problem and introduce an intrusion detection system that incorporates large-scale pre-training: a large language model is trained on tens of millions of command lines for AI-based intrusion detection. Experiments on 30 million training samples and 10 million test samples verify the effectiveness of our solution.