Abstract
We determine the accuracy with which machine learning and deep learning
techniques can classify selected World War II era ciphers when only ciphertext
is available. The specific ciphers considered are Enigma, M-209, Sigaba,
Purple, and Typex. We experiment with three classic machine learning models,
namely, Support Vector Machines (SVM), $k$-Nearest Neighbors ($k$-NN), and
Random Forest (RF). We also experiment with four deep learning neural
network-based models: Multi-Layer Perceptrons (MLP), Long Short-Term Memory
(LSTM), Extreme Learning Machines (ELM), and Convolutional Neural Networks
(CNN). Each model is trained on features consisting of histograms, digrams, and
raw ciphertext letter sequences. Furthermore, the classification problem is
considered under four distinct scenarios: fixed plaintext with fixed keys,
random plaintext with fixed keys, fixed plaintext with random keys, and random
plaintext with random keys. Under the most realistic scenario, given 1000
characters per ciphertext, we are able to distinguish the ciphers with greater
than 97% accuracy. In addition, we consider the accuracy of a subset of the
learning techniques as a function of the length of the ciphertext messages.
Somewhat surprisingly, our classic machine learning models perform at least as
well as our deep learning models. We also find that ciphers that are more
similar in design are somewhat more challenging to distinguish, but not as
difficult as might be expected.
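
As an illustration of the histogram and digram features mentioned above, the following is a minimal sketch of how such feature vectors might be computed from a ciphertext string. The function names, the restriction to a 26-letter alphabet, and the normalization to relative frequencies are assumptions made for this example and are not taken from the paper.

```python
from collections import Counter
from string import ascii_uppercase

# Assumption: ciphertext is drawn from the 26 uppercase letters A-Z.
ALPHABET = ascii_uppercase


def letter_histogram(ciphertext):
    """Return a 26-element list of relative letter frequencies."""
    letters = [c for c in ciphertext.upper() if c in ALPHABET]
    counts = Counter(letters)
    total = len(letters) or 1  # avoid division by zero on empty input
    return [counts[c] / total for c in ALPHABET]


def digram_features(ciphertext):
    """Return a 676-element list of relative frequencies of adjacent letter pairs."""
    letters = [c for c in ciphertext.upper() if c in ALPHABET]
    pairs = Counter(zip(letters, letters[1:]))  # overlapping digrams
    total = max(len(letters) - 1, 1)
    return [pairs[(a, b)] / total for a in ALPHABET for b in ALPHABET]
```

Vectors of this kind could then be passed to any of the classifiers listed above, whereas the raw letter sequence would instead be encoded directly, for example as integers in the range 0 to 25.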