The field of Natural Language Processing (NLP) is being transformed by
pre-trained Large Language Models (LLMs) built on Transformer architectures.
As the
frequency and diversity of cybersecurity attacks continue to rise, the
importance of incident detection has significantly increased. The number of IoT
devices is growing rapidly, creating a need for efficient techniques to
autonomously identify network-based attacks in IoT networks with both high
precision and minimal computational requirements. This paper presents
SecurityBERT, a novel architecture that leverages the Bidirectional Encoder
Representations from Transformers (BERT) model for cyber threat detection in
IoT networks. During the training of SecurityBERT, we incorporated a novel
privacy-preserving encoding technique called Privacy-Preserving Fixed-Length
Encoding (PPFLE). We effectively represented network traffic data in a
structured format by combining PPFLE with the Byte-level Byte-Pair Encoder
(BBPE) Tokenizer. Our research demonstrates that SecurityBERT outperforms
traditional Machine Learning (ML) and Deep Learning (DL) methods, such as
Convolutional Neural Networks (CNNs) or Recurrent Neural Networks (RNNs), in
cyber threat detection. Our experimental analysis on the Edge-IIoTset
cybersecurity dataset shows that SecurityBERT achieved 98.2% overall accuracy
in identifying fourteen distinct attack types, surpassing results previously
reported for hybrid solutions such as GAN-Transformer-based architectures and
CNN-LSTM models. With an inference time of less than 0.15 seconds on an average
CPU and a compact model size of just 16.7 MB, SecurityBERT is well suited to
real-life traffic analysis and to deployment on resource-constrained IoT
devices.
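
The abstract does not spell out how PPFLE works, so the following is only a minimal sketch of one way a privacy-preserving fixed-length encoding could be realised. It assumes that each feature name and value pair is hashed and truncated to a fixed width. The function name `ppfle_encode`, the use of SHA-256, the 8-character token width, and the sample field names are all hypothetical choices made for illustration, not details taken from the paper. The BBPE tokenizer step, which would consume the resulting text, is not shown.

```python
import hashlib


def ppfle_encode(record: dict, digest_chars: int = 8) -> str:
    """Hypothetical fixed-length encoding of one network-traffic record.

    Assumption: each "column=value" pair is hashed with SHA-256 and the hex
    digest is truncated to `digest_chars` characters, so every feature maps
    to a token of the same length and raw values do not appear in the output.
    """
    tokens = []
    for column, value in record.items():
        payload = f"{column}={value}".encode("utf-8")
        digest = hashlib.sha256(payload).hexdigest()[:digest_chars]
        tokens.append(digest)
    # One space-separated "sentence" per record, ready for a text tokenizer.
    return " ".join(tokens)


# Illustrative record; the field names are placeholders, not dataset columns.
record = {"tcp.srcport": 443, "tcp.len": 1500, "mqtt.msgtype": 3}
encoded = ppfle_encode(record)
print(encoded)
```

Under these assumptions, identical feature values always map to the same token, which lets a downstream language model learn patterns over the hashed tokens without seeing the original values.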