06.01.2015 Views

Cryptology - Unofficial St. Mary's College of California Web Site


CHAPTER 8. POLYALPHABETIC CIPHERS

8.2 The Measure of Roughness

It is not necessarily clear how the Index of Coincidence is connected to polyalphabetic ciphers. To glimpse this connection we must think back to the differences between the frequency counts of monoalphabetic and polyalphabetic ciphers. We have learned to recognize a Caesar Cipher from the intense highs and lows of its frequency count: there are many letters that occur quite often (the ciphertext versions of the etaoinshr letters) and many that seldom occur (the uvwxyz-types). However, in polyalphabetic ciphertexts the frequencies are less sharp; the highs are lower and the lows are higher. We might say that a Caesar Cipher has a frequency count that is much rougher than the frequency count of a Vigenère Cipher. Further, the longer the keyword of a Vigenère Cipher is, the smoother the frequency count is.

To illustrate with an example, I’ve enciphered the quote of General Givierge from Section 7.6 using keys of various lengths. (I used the alphabet as the key, so the five-letter key was ABCDE.) The frequency counts of the resulting ciphertexts appear in Figure 8.1. Notice that as the keys get longer, there are fewer numbers that are much larger or much smaller than “average” and more that are in the middle. In general, the longer a keyword is, the smoother the frequencies are, and the shorter the keyword, the rougher the frequencies, which makes sense. After all, the point of polyalphabetic ciphers was to have each plaintext letter become many different cipherletters, causing the individual frequency counts to become more and more similar.

keylength  A  B  C  D  E  F  G  H  I  J  K  L  M  N  O  P  Q  R  S  T  U  V  W  X  Y  Z
one       13  1  7 10 20  4  3  4 14  0  1  9  5 19 21  3  0 11  4 16  5  1  3  1  8  0
three      3  6  6  7 15 10  8  4  6  7  4  5  3 10 13 17  9  4  6  9 11  6  2  2  5  5
five       5  4  5  8 10  5 12  7  7  7  3  5 10  7 12 10  7 12  8  4  8  9  7  6  3  2
ten       10  5  7  5  7  5  6  4  6  5  4 10  6 11 13  4  6 13  6  7  9  6 10  7  3  8
twenty    10 11  9 10 11  7  8  5  7  8  5  8  6  6  6  5  5  6  4  3  8  6  7  9  7  6

Figure 8.1: Frequency Counts: Same quote, different keylengths.
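The experiment behind Figure 8.1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names (`vigenere_encipher`, `frequency_count`, `alphabet_key`) are hypothetical, the key letter A is taken to mean a shift of zero, and the `sample` string is a stand-in sentence, not the Givierge quote, so the counts it prints will not match the figure.

```python
from collections import Counter
import string

ALPHABET = string.ascii_uppercase


def vigenere_encipher(plaintext, key):
    """Shift each plaintext letter by the matching key letter (A = 0, B = 1, ...)."""
    # Keep letters only, so spaces and punctuation do not consume key letters.
    letters = [c for c in plaintext.upper() if c in ALPHABET]
    shifted = []
    for i, c in enumerate(letters):
        shift = ALPHABET.index(key[i % len(key)])
        shifted.append(ALPHABET[(ALPHABET.index(c) + shift) % 26])
    return "".join(shifted)


def frequency_count(text):
    """Return the 26 letter counts of text, in the order A through Z."""
    counts = Counter(text)
    return [counts[letter] for letter in ALPHABET]


def alphabet_key(length):
    """Use the start of the alphabet as the key, e.g. length 5 gives ABCDE."""
    return ALPHABET[:length]


# Stand-in plaintext (an assumption; substitute the quote from Section 7.6).
sample = "the longer the keyword is the smoother the frequency count becomes"

for length in (1, 3, 5, 10, 20):
    ciphertext = vigenere_encipher(sample, alphabet_key(length))
    print(length, frequency_count(ciphertext))
```

With a long enough plaintext, the printed rows should show the same trend as the figure: the highs shrink and the lows grow as the key gets longer.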

Does the converse hold? Can we somehow measure the “roughness” of a frequency count, and then use the measurement to estimate the keylength? Let’s start by thinking about measuring roughness.

Example: Consider the three sets of numbers

{3, 3, 3, 3} {4, 0, 4, 4} {1, 4, 1, 6}.

We can probably all agree that these sets move from “smoothest” to “roughest.” Why do we feel so? The latter two sets are rougher because their numbers are more spread out, that is, farther away from each other. What mathematical device can measure this?
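One candidate device, offered here as an assumption and not necessarily the measure this chapter goes on to adopt, is the variance: the average squared distance of the numbers from their mean. The sketch below applies a hypothetical helper `variance` to the three example sets and then to the rows of Figure 8.1.

```python
def variance(counts):
    """Average squared distance of the numbers from their mean."""
    mean = sum(counts) / len(counts)
    return sum((x - mean) ** 2 for x in counts) / len(counts)


# The three example sets, listed from smoothest to roughest.
example_sets = [[3, 3, 3, 3], [4, 0, 4, 4], [1, 4, 1, 6]]
for numbers in example_sets:
    print(numbers, variance(numbers))
# prints 0.0, then 3.0, then 4.5

# The rows of Figure 8.1, keyed by keylength.
figure_8_1 = {
    "one": [13, 1, 7, 10, 20, 4, 3, 4, 14, 0, 1, 9, 5,
            19, 21, 3, 0, 11, 4, 16, 5, 1, 3, 1, 8, 0],
    "three": [3, 6, 6, 7, 15, 10, 8, 4, 6, 7, 4, 5, 3,
              10, 13, 17, 9, 4, 6, 9, 11, 6, 2, 2, 5, 5],
    "five": [5, 4, 5, 8, 10, 5, 12, 7, 7, 7, 3, 5, 10,
             7, 12, 10, 7, 12, 8, 4, 8, 9, 7, 6, 3, 2],
    "ten": [10, 5, 7, 5, 7, 5, 6, 4, 6, 5, 4, 10, 6,
            11, 13, 4, 6, 13, 6, 7, 9, 6, 10, 7, 3, 8],
    "twenty": [10, 11, 9, 10, 11, 7, 8, 5, 7, 8, 5, 8, 6,
               6, 6, 5, 5, 6, 4, 3, 8, 6, 7, 9, 7, 6],
}
for keylength, counts in figure_8_1.items():
    print(keylength, round(variance(counts), 2))
```

Under this measure the three sets come out in the order we agreed on, and the rows of Figure 8.1 get steadily less rough as the keylength grows from one to twenty.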
