06.01.2015 Views

Cryptology - Unofficial St. Mary's College of California Web Site



the substitution for b, 24 for c’s substitute, etc. Thus there are

26! = 26 × 25 × 24 × ··· × 2 × 1
    = 403,291,461,126,605,635,584,000,000

different monoalphabetic substitution ciphers. How large a number is this? If we used a computer that could check one trillion different possibilities every second, we’d need more than 12 million years to check all the possibilities!
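
The arithmetic behind this estimate can be checked with a short Python sketch. The constant names are illustrative, and the year is assumed to have 365 days (leap years ignored).

```python
import math

# One monoalphabetic substitution cipher per permutation of the 26 letters.
keys = math.factorial(26)

# Speed assumed in the text: one trillion possibilities checked per second.
CHECKS_PER_SECOND = 10**12
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # assumption: 365-day year

years = keys / (CHECKS_PER_SECOND * SECONDS_PER_YEAR)

print(f"{keys:,}")                          # 403,291,461,126,605,635,584,000,000
print(f"{years / 1e6:.1f} million years")   # 12.8 million years
```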

Of course, no one simply uses brute force to break monoalphabetic ciphers. In our current example X almost certainly must be e, and most of etaoinshr probably comes from CFKQSTU. This cuts down immensely on the number of possibilities. But to decrypt such a cipher in a reasonable amount of time we must go beyond simple frequency counts to consider the behaviors of the letters.
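
A simple frequency count of the kind referred to here might look like the following Python sketch. The function name `letter_frequencies` and the sample string are hypothetical; the sample is not the ciphertext discussed in the text.

```python
from collections import Counter

def letter_frequencies(text):
    """Return (letter, percent) pairs for the letters A-Z in text, most common first."""
    letters = [ch for ch in text.upper() if "A" <= ch <= "Z"]
    counts = Counter(letters)
    total = len(letters)
    return [(letter, 100 * n / total) for letter, n in counts.most_common()]

# Hypothetical ciphertext, used only to show the call.
sample = "XLI UYMGO FVSAR JSB NYQTW SZIV XLI PEDC HSK"
for letter, pct in letter_frequencies(sample)[:5]:
    print(f"{letter}: {pct:.1f}%")
```

The most frequent ciphertext letters are then the natural candidates for the most frequent plaintext letters.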

5.6 Basic Letter Characteristics<br />

We’ve seen which letters are the most common (etaoinshr) and least common (vkjxqz). We next look at which letters appear first and last in words. We begin with the frequency information from earlier. (The initial and final letter percentages are from Sinkov’s study of 16,410 words [Sinkov].)

 8.2   1.5   2.8   4.3  12.7   2.2   2.0   6.1   7.0   0.2   0.8   4.0   2.4
  a     b     c     d     e     f     g     h     i     j     k     l     m

 6.7   7.5   1.9   0.1   6.0   6.3   9.1   2.8   1.0   2.4   0.2   2.0   0.1
  n     o     p     q     r     s     t     u     v     w     x     y     z

Figure 5.1: Letter Frequencies (percent) – Anywhere.

11.0   4.6   5.6   2.8   2.5   4.1   1.8   3.9   5.6   0.6   0.5   2.1   3.5
  a     b     c     d     e     f     g     h     i     j     k     l     m

 2.4   7.2   4.7     0   3.1   7.4  15.9   1.4   0.6   5.1     0   0.7     0
  n     o     p     q     r     s     t     u     v     w     x     y     z

Figure 5.2: Letter Frequencies (percent) – Initial Letters.
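
The rankings can be read off the two figures by sorting their entries. The Python sketch below transcribes the percentages from Figures 5.1 and 5.2; the names `LETTERS`, `ANYWHERE`, `INITIAL`, and `top_letters` are illustrative only.

```python
LETTERS = "abcdefghijklmnopqrstuvwxyz"

# Percentages transcribed from Figure 5.1 (anywhere) and Figure 5.2 (initial letters).
ANYWHERE = [8.2, 1.5, 2.8, 4.3, 12.7, 2.2, 2.0, 6.1, 7.0, 0.2, 0.8, 4.0, 2.4,
            6.7, 7.5, 1.9, 0.1, 6.0, 6.3, 9.1, 2.8, 1.0, 2.4, 0.2, 2.0, 0.1]
INITIAL = [11.0, 4.6, 5.6, 2.8, 2.5, 4.1, 1.8, 3.9, 5.6, 0.6, 0.5, 2.1, 3.5,
           2.4, 7.2, 4.7, 0, 3.1, 7.4, 15.9, 1.4, 0.6, 5.1, 0, 0.7, 0]

def top_letters(percentages, n):
    """Return the n letters with the highest percentages, highest first.

    Ties stay in alphabetical order because sorted() is stable.
    """
    ranked = sorted(zip(LETTERS, percentages), key=lambda pair: -pair[1])
    return "".join(letter for letter, _ in ranked[:n])

print(top_letters(ANYWHERE, 5))  # etaoi
print(top_letters(INITIAL, 6))   # tasoci (c and i tie at 5.6, so their order is arbitrary)
```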

Summarizing, the most common individual letters are<br />

1) Anywhere: etaoi, 4 vowels and t.<br />

2) Beginning words: tasoic, with t easily the most common.<br />

3) Ending words: edtsn (almost spells “endts”).<br />

4) Doubles: lesot.<br />

We put this summary into Figure 5.4.
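
To produce the same four tallies for a text of one's own, a sketch along the following lines may help. It assumes words are maximal runs of the letters a-z (so an apostrophe splits a word), and the function names and sample sentence are placeholders, not part of Sinkov's study.

```python
import re
from collections import Counter

def positional_tallies(text):
    """Tally letters by role: anywhere, word-initial, word-final, and doubled."""
    # Assumption: a word is a maximal run of the letters a-z.
    words = re.findall(r"[a-z]+", text.lower())
    return {
        "anywhere": Counter("".join(words)),
        "initial": Counter(word[0] for word in words),
        "final": Counter(word[-1] for word in words),
        # A double is the same letter twice in a row inside one word.
        "doubles": Counter(a for word in words
                           for a, b in zip(word, word[1:]) if a == b),
    }

def ranking(counter, n=5):
    """Return the n most common letters as a string, most common first."""
    return "".join(letter for letter, _ in counter.most_common(n))

# Placeholder sample; any long English plaintext can be substituted.
sample = "The letter tells all that the committee needs to see"
for role, counter in positional_tallies(sample).items():
    print(role, ranking(counter))
```

A sample this short will not reproduce the rankings above; they only emerge from a large body of text.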
