Implementing Cryptography Using Python

120 Chapter 4 ■ Cryptographic Math and Frequency Analysis

Frequency Analysis

The primary focus of this section is to teach the computer to recognize when

a set of letters matches the frequency distribution of plaintext English. You

can apply the information in this section in a number of different ways. The

ultimate goal is to give you the tools you need to be able to crack a number of

classical ciphers.

Frequency analysis is the study of the frequency of letters or combinations

of letters. It is based on the fact that, within any written language, certain letters

and combinations of letters will occur with varying frequencies. When you

examine the frequency of letters used in the English language, you find that the

letters E, T, A, and O are the most common, while Z, Q, and X occur far less

frequently. Examine Figure 4.5 to see the occurrence of each letter.

Figure 4.5: FA.py
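
As a rough illustration of how letter frequencies can be tallied, here is a minimal sketch. It is not the book's FA.py listing; the function name `letter_frequencies` and the sample sentence are assumptions made for this example.

```python
from collections import Counter
import string


def letter_frequencies(text):
    """Return a dict mapping each letter A-Z to its share of the letters in text, in percent."""
    # Keep only the 26 English letters; ignore spaces, digits, and punctuation.
    letters = [ch for ch in text.upper() if ch in string.ascii_uppercase]
    counts = Counter(letters)
    total = len(letters)
    if total == 0:
        return {letter: 0.0 for letter in string.ascii_uppercase}
    return {letter: 100.0 * counts[letter] / total
            for letter in string.ascii_uppercase}


# Hypothetical sample text; a real analysis needs far more text to be reliable.
sample = "Defend the east wall of the castle"
freqs = letter_frequencies(sample)

# Print the five most frequent letters in the sample.
for letter, pct in sorted(freqs.items(), key=lambda kv: kv[1], reverse=True)[:5]:
    print(f"{letter}: {pct:.1f}%")
```

On a text this short the ranking will not match English as a whole; the distribution only stabilizes as the sample grows.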

You will also find a higher frequency of letter combinations such as TH, ER, ON,

and AN. These combinations are known as bigrams or digraphs. There are also

common pairs of repeating letters such as SS, EE, TT, and FF. When you encrypt

English plaintext into ciphertext using many historical ciphers, these same

properties will be preserved and can be exploited in a ciphertext-only attack.
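
Bigrams can be counted in much the same way as single letters. The sketch below is one possible approach, not code from the book; the helper names `bigram_counts` and `doubled_letters` and the sample sentence are hypothetical. It counts pairs only within words, so a pair never spans a space.

```python
from collections import Counter
import re


def bigram_counts(text):
    """Count adjacent letter pairs (bigrams) inside each word of text."""
    counts = Counter()
    # Split into runs of letters so that pairs do not cross word boundaries.
    for word in re.findall(r"[A-Z]+", text.upper()):
        for i in range(len(word) - 1):
            counts[word[i:i + 2]] += 1
    return counts


def doubled_letters(counts):
    """Keep only the bigrams made of one repeated letter, such as SS or EE."""
    return {pair: n for pair, n in counts.items() if pair[0] == pair[1]}


# Hypothetical sample text for illustration.
sample = "The weather on the street seemed better than ever"
counts = bigram_counts(sample)
print(counts.most_common(3))
print(doubled_letters(counts))
```

Running the same functions on ciphertext produced by a simple substitution cipher gives the same counts attached to different letter pairs, which is what makes the pattern exploitable.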

As you will recall, historical substitution ciphers replace each letter in the

plaintext with an alternate letter. If you encrypt with the Caesar cipher and a

key of 3, then each occurrence of E becomes H. If you have

enough ciphertext to determine that H is the most frequent letter, then you

can infer that the key is the difference between the letters H and E, which is 3.
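
The reasoning above can be sketched in a few lines of Python. This is an illustrative example under stated assumptions, not the book's implementation: the names `caesar_shift` and `guess_caesar_key` are hypothetical, and the guess assumes that the most frequent ciphertext letter stands for E, which can fail on short or unusual texts.

```python
from collections import Counter
import string

ALPHABET = string.ascii_uppercase


def caesar_shift(text, key):
    """Shift every letter of text by key positions; other characters pass through."""
    result = []
    for ch in text.upper():
        if ch in ALPHABET:
            result.append(ALPHABET[(ALPHABET.index(ch) + key) % 26])
        else:
            result.append(ch)
    return "".join(result)


def guess_caesar_key(ciphertext, assumed_plain="E"):
    """Guess the key by assuming the most frequent ciphertext letter encrypts assumed_plain."""
    letters = [ch for ch in ciphertext.upper() if ch in ALPHABET]
    if not letters:
        raise ValueError("ciphertext contains no letters")
    most_common = Counter(letters).most_common(1)[0][0]
    # The key is the distance from the assumed plaintext letter to the observed one.
    return (ALPHABET.index(most_common) - ALPHABET.index(assumed_plain)) % 26


# Hypothetical message chosen so that E is clearly the most frequent letter.
plaintext = "MEET ME AT THE SECRET SHELTER NEAR THE WESTERN GATE AT ELEVEN"
ciphertext = caesar_shift(plaintext, 3)
key = guess_caesar_key(ciphertext)
recovered = caesar_shift(ciphertext, -key)
print(key, recovered)
```

If the recovered text is not readable, a reasonable next step is to repeat the guess with T, A, or O as the assumed plaintext letter.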
