06.01.2015 Views

Cryptology - Unofficial St. Mary's College of California Web Site


196 CHAPTER 10. TRANSPOSITION CIPHERS

cipher will generally have too many jvkxyq’s and too few etaoinshr’s. To our eyes, trained to read English, transposition ciphers will look like a salad of letters (to mis-translate Bazeries’ quote), worthwhile components but oddly mixed, whereas substitution ciphertexts look unappetizing, with too many odd letters.

10.6 Letter Connections<br />

Decrypting a transposition cipher is a bit like putting Humpty Dumpty back together again: all the pieces are given, and we just need to determine which goes where. When doing this, we will continually have a letter in hand and will be wondering which letter most likely came just before it in the message, and which letter most likely came just after it. These are questions of conditional probability. Notice the difference between

“This letter is a T. How likely is it that the next letter is an H?”

and

“How likely is it that this letter is a T and the next letter is an H?”
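The two questions are linked by the usual definition of conditional probability. As a sketch, in notation that is assumed here rather than taken from the text, write P(T) for the chance that a letter is T and P(TH) for the chance that a pair of adjacent letters is TH:

```latex
\[
  P(\text{next is H} \mid \text{this is T}) \;=\; \frac{P(\text{TH})}{P(\text{T})}
\]
```

The left side is the first question and the numerator on the right is the second. Since P(T) is at most 1, the answer to the first question is always at least as large as the answer to the second.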

Figure 10.2 provides the answer to the first question, in percent.⁹ In the row centered on T, just to the right of T appears H 32, which indicates that T is followed by H almost 1/3 of the time. Similarly, the appearance of U 4 E 17 O 18 A 22 I 26 just before N shows that N is almost always preceded by a vowel, with that vowel being I about 1/4 of the time. Likewise, when the rare V does appear, it is usually followed by E.¹⁰
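A table of this kind can be approximated from any sample text. The following is a minimal Python sketch, not the method used to build Figure 10.2: the function name `neighbor_percentages` and its `side` parameter are made up for illustration, and it assumes that spaces and punctuation are stripped first, so that pairs spanning a word boundary are counted.

```python
from collections import Counter


def neighbor_percentages(text, letter, side="after"):
    """Percent of the time each letter appears just after (or before) `letter`.

    Assumption: only A-Z are kept, so pairs across word boundaries count.
    """
    letters = [c for c in text.upper() if "A" <= c <= "Z"]
    # Count every pair of adjacent letters in the stripped text.
    pairs = Counter(zip(letters, letters[1:]))
    if side == "after":
        hits = {b: n for (a, b), n in pairs.items() if a == letter}
    else:
        hits = {a: n for (a, b), n in pairs.items() if b == letter}
    total = sum(hits.values())
    # Condition on `letter`: divide by the pairs that contain it on that side.
    return {k: 100 * n / total for k, n in hits.items()} if total else {}


sample = "THAT THE TOT"
print(neighbor_percentages(sample, "T"))                 # letters following T
print(neighbor_percentages(sample, "T", side="before"))  # letters preceding T
```

On a toy sample like this the percentages mean little; a corpus of realistic size is needed before the rows settle toward values like those in the figure.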

For comparison, the answer to the second question (“How likely is it that this pair is TH?”) appears in Figure 10.3. This figure is the bigram companion to our standard frequency chart, Figure 1.3, as it shows how likely each possible bigram is in standard English.¹¹
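A bigram table in the same units can be sketched in Python as follows. This is an illustration under stated assumptions, not the procedure behind Figure 10.3: the names `bigram_table` and `as_percent` are hypothetical, non-letters are stripped before pairing, and entries are reported in hundredths of a percent with anything below 0.01% omitted.

```python
from collections import Counter


def bigram_table(text):
    """Bigram frequencies in hundredths of a percent (the "%%" unit)."""
    letters = [c for c in text.upper() if "A" <= c <= "Z"]
    pairs = Counter(a + b for a, b in zip(letters, letters[1:]))
    total = sum(pairs.values())
    table = {}
    for pair, n in pairs.items():
        value = 10000 * n / total  # share of all pairs, times 100 * 100
        if value >= 1:             # leave out anything below 0.01%
            table[pair] = round(value)
    return table


def as_percent(entry):
    """Convert a table entry back to an ordinary percentage."""
    return entry / 100


print(bigram_table("THAT THE TOT"))
print(as_percent(13))  # an entry of 13 means 0.13%
```

Unlike the conditional table, every entry here is divided by the total number of pairs, so the entries of the whole table sum to roughly 10000 (that is, 100%).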

⁹ These percentages were computed using “The Brown Corpus.” The Brown Corpus of Standard American English was compiled by W.N. Francis and H. Kucera at Brown University, Providence, RI, from one million words of American English texts printed in 1961. The texts sampled came from fifteen different categories ranging from “Reportage” (The Philadelphia Inquirer, May 10, 1961, p.49) to “Popular Lore” (Jack Kaplan, “The Health Machine Menace: Therapy by Witchcraft”) to “Romance” (Samuel Elkin, “The Ball Player,” Nugget, October, 1961). By modern standards, this corpus is considered small and dated.

¹⁰ We remarked back in Chapter 6 that vowels like to combine with consonants. Look at the vowel rows to ascertain the validity of this statement. Likewise, note that H is more likely to precede vowels, while N and S are more likely to follow them.

¹¹ The numbers are in %%, meaning that you should divide by 100 and add % to the end. So the 13 in the BA entry means that BA appears 0.13% of the time, about 1/10 of 1 percent of the time. Values smaller than 0.01% have been left out.
