Classifying World War II Era Ciphers with Machine Learning

Cryptologia

Source

https://arxiv.org/abs/2307.00501

PDF

https://arxiv.org/pdf/2307.00501

Paper Information

Authors: Brooke Dalton; Mark Stamp
Published: 2023-07-02
Updated: 2023-08-30
Affiliation: Department of Computer Science, San Jose State University
Country: United States of America
Venue: Cryptologia (journal)

Labels Estimated by AI

Hyperparameter Tuning; Machine Learning Technology; History of Cryptography

These labels were automatically added by AI and may be inaccurate.

Abstract

We determine the accuracy with which machine learning and deep learning techniques can classify selected World War II era ciphers when only ciphertext is available. The specific ciphers considered are Enigma, M-209, Sigaba, Purple, and Typex. We experiment with three classic machine learning models, namely, Support Vector Machines (SVM), $k$-Nearest Neighbors ($k$-NN), and Random Forest (RF). We also experiment with four deep learning neural network-based models: Multi-Layer Perceptrons (MLP), Long Short-Term Memory (LSTM), Extreme Learning Machines (ELM), and Convolutional Neural Networks (CNN). Each model is trained on features consisting of histograms, digrams, and raw ciphertext letter sequences. Furthermore, the classification problem is considered under four distinct scenarios: fixed plaintext with fixed keys, random plaintext with fixed keys, fixed plaintext with random keys, and random plaintext with random keys. Under the most realistic scenario, given 1000 characters per ciphertext, we are able to distinguish the ciphers with greater than 97% accuracy. In addition, we consider the accuracy of a subset of the learning techniques as a function of the length of the ciphertext messages. Somewhat surprisingly, our classic machine learning models perform at least as well as our deep learning models. We also find that ciphers that are more similar in design are somewhat more challenging to distinguish, but not as difficult as might be expected.
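
The histogram and digram features mentioned in the abstract can be illustrated with a minimal Python sketch. This is not the authors' code: the function names, the restriction to the 26 letters A-Z, the use of overlapping letter pairs, and the normalization to relative frequencies are all assumptions made for illustration, since the abstract does not state how the features were constructed.

```python
from string import ascii_uppercase

# Assumption: ciphertext is drawn from the 26 letters A-Z.
ALPHABET = ascii_uppercase
INDEX = {ch: i for i, ch in enumerate(ALPHABET)}


def clean(ciphertext):
    """Upper-case the text and keep only the letters A-Z."""
    return [ch for ch in ciphertext.upper() if ch in INDEX]


def letter_histogram(ciphertext):
    """Return 26 relative letter frequencies (all zeros for empty input)."""
    letters = clean(ciphertext)
    counts = [0] * 26
    for ch in letters:
        counts[INDEX[ch]] += 1
    total = len(letters)
    return [c / total for c in counts] if total else [0.0] * 26


def digram_histogram(ciphertext):
    """Return 676 relative frequencies of overlapping letter pairs."""
    letters = clean(ciphertext)
    counts = [0] * (26 * 26)
    # Pair each letter with its successor: "ABBA" yields AB, BB, BA.
    for a, b in zip(letters, letters[1:]):
        counts[INDEX[a] * 26 + INDEX[b]] += 1
    total = len(letters) - 1
    return [c / total for c in counts] if total > 0 else [0.0] * (26 * 26)
```

Under these assumptions, each ciphertext maps to a fixed-length vector (26 or 676 values) regardless of message length, which is the kind of input that classifiers such as SVM, $k$-NN, or Random Forest expect.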