Title: Compression-based methods for the automatic cryptanalysis of classical ciphers
Author: Al-Kazaz, Noor
ISNI: 0000 0004 7967 7080
Awarding Body: Bangor University
Current Institution: Bangor University
Date of Award: 2019
The study documented in this thesis investigates the effectiveness of compression in the field of cryptanalysis, specifically for the automatic cryptanalysis of classical ciphers, initially for the English language. Several new compression-based cryptanalysis methods are developed against these ciphers. The new methods use the well-known compression scheme prediction by partial matching (PPM) and have been applied to the automatic cryptanalysis of three main classical ciphers: simple substitution, transposition and Playfair ciphers. The extensive set of case studies adopted in this research has validated the new methods, which have proven to be very effective in the cryptanalysis of these cases, with a high success rate: for substitution ciphers, 92% of the cryptograms were correctly solved with no errors and 100% with three errors or fewer; a 100% decryption success rate was achieved for transposition ciphers and 87% for Playfair ciphers. This study led to the decipherment of more challenging cases, such as very short ciphertexts with no probable words. The Gzip compression scheme has also been applied to the automatic decryption of simple substitution and transposition ciphers, but the results showed that Gzip was not as effective as PPM. A third compressor, Bzip2, could not be used because the nature of that scheme made its use infeasible. The PPM compression-based cryptanalysis methods offered significant improvements in decryption accuracy across a diverse range of experiments while being more computationally efficient than previously published techniques. In addition, extensive investigations were conducted to determine the most appropriate type of PPM scheme to apply in the cryptanalysis of these ciphers. These findings highlight why better models are of vital importance in cryptology. In particular, the study has shown how a good model of the source (i.e. the PPM compression model), a method that shows a high level of performance when applied to different language modelling tasks, can also be used effectively in the automatic decryption of different classical ciphers. As spaces have traditionally been omitted from ciphertext, a full cryptanalysis mechanism that also automatically adds spaces to decrypted texts, again using a compression-based approach, has been proposed to achieve readability.
This work has also investigated whether the newly devised cryptanalysis methods are applicable to another language (specifically Arabic, as it is a language unrelated to English). Arabic is a morphologically rich language with its own characteristics that differentiate it from other languages. The current study has specifically adapted the new compression-based methods for the automatic cryptanalysis of classical Arabic ciphers (simple substitution, transposition and Playfair ciphers). Although the experiments conducted with Arabic ciphers have generally been less effective than those with classical English ciphers, excellent results have been achieved: for Arabic substitution ciphers, 72% of the cryptograms were successfully solved without any errors and over 91% with three errors or fewer; a 97% decryption success rate was achieved for Arabic transposition ciphers and 73% for Arabic Playfair ciphers.
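The core idea of the abstract, ranking candidate decryptions by how well a language model compresses them, can be illustrated with a minimal sketch. This is not the thesis's method: it swaps PPM for a much simpler smoothed character-bigram model, attacks only a shift cipher (a special case of simple substitution) and keeps spaces in the text. The training sample and all function names are hypothetical, chosen for illustration only.

```python
import math
from collections import Counter

ALPHABET = "abcdefghijklmnopqrstuvwxyz"

# Tiny illustrative training sample (an assumption; a real model would be
# trained on a large corpus, and the thesis uses PPM rather than bigrams).
TRAINING_TEXT = (
    "the study documented in this thesis investigates the effectiveness of "
    "compression in the field of cryptanalysis for the english language and "
    "the new methods have proven to be very effective in the cryptanalysis "
    "of these ciphers with a high success rate"
)


def train_bigram_model(text):
    """Count character pairs and the contexts (first characters) they follow."""
    pairs = Counter(zip(text, text[1:]))
    contexts = Counter(text[:-1])
    return pairs, contexts


def codelength_bits(text, model, symbols=27):
    """Approximate compressed size of `text` in bits under the bigram model.

    Uses add-one smoothing over `symbols` characters (26 letters plus space).
    """
    pairs, contexts = model
    bits = 0.0
    for prev, cur in zip(text, text[1:]):
        prob = (pairs[(prev, cur)] + 1) / (contexts[prev] + symbols)
        bits -= math.log2(prob)
    return bits


def shift_letters(text, key):
    """Shift every letter back by `key` positions; other characters are kept."""
    return "".join(
        ALPHABET[(ALPHABET.index(ch) - key) % 26] if ch in ALPHABET else ch
        for ch in text
    )


def break_shift_cipher(ciphertext, model):
    """Try all 26 keys and return the one whose decryption compresses best."""
    scored = [
        (codelength_bits(shift_letters(ciphertext, key), model), key)
        for key in range(26)
    ]
    _, best_key = min(scored)
    return best_key, shift_letters(ciphertext, best_key)


if __name__ == "__main__":
    model = train_bigram_model(TRAINING_TEXT)
    plaintext = "these methods have a high success rate for the english language"
    ciphertext = shift_letters(plaintext, -7)  # a negative key encrypts
    print(break_shift_cipher(ciphertext, model))
```

The candidate with the smallest codelength is taken as the decryption, so the quality of the language model directly determines how reliably the correct key stands out; this is the sense in which the abstract says better models matter.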
Supervisor: Teahan, William
Sponsor: Not available
Qualification Name: Thesis (Ph.D.)
Qualification Level: Doctoral
EThOS ID:
DOI: Not available
Keywords: compression ; PPM ; cryptanalysis ; plain text recognition ; word segmentation