CHAPTER FOUR

Information Theory

Information theory concerns the measure of information contained in data. The security of a cryptosystem is determined by the relative information content of the key, plaintext, and ciphertext. For our purposes, a discrete probability space (a finite set X together with a probability function on X) will model a language. Such probability spaces may represent the space of keys, of plaintext, or of ciphertext; we may refer to any of these as a space or a language, and to an element x as a message. The probability function P : X → R is a non-negative real-valued function on X such that

    ∑_{x ∈ X} P(x) = 1.

For a naturally occurring language, we reduce to a finite model of it by considering finite sets consisting of strings of length N in that language. If X models the English language, then the function P assigns to each string its probability of appearance, among all strings of length N, in the English language.

4.1 Entropy

The entropy of a given space X with probability function P is a measure of the information content of the language. The formal definition of entropy is

    H(X) = ∑_{x ∈ X} P(x) log_2(P(x)^{-1}).

For 0 < P(x) < 1, the value log_2(P(x)^{-1}) is a positive real number, and we define P(x) log_2(P(x)^{-1}) = 0 when P(x) = 0. The following exercise justifies this convention.

Exercise. Show that the limit

    lim_{x → 0^+} x log_2(x^{-1}) = 0.
What is the maximum value of x log_2(x^{-1}), and at what value of x does it occur?

An optimal encoding for a probability space X is an injective map from X to strings over some alphabet, such that the expected string length of encoded messages is minimized. The term log_2(P(x)^{-1}) is the expected bit-length for the encoding of the message x in an optimal encoding, if one exists, and the entropy is the expected number of bits in a random message in the space.

As an example, English text files written in 8-bit ASCII can typically be compressed to 40% of their original size without loss of information, since the structure of the language itself encodes the remaining information. The human genome encodes the data for producing sequences of 20 different amino acids, each specified by a triple of letters in the alphabet {A, T, C, G}. The 64 possible "words" (codons in genetics) thus include more than 3-fold redundancy in specifying one of these 20 amino acids. Moreover, huge sections of the genome are repeats such as AAAAAA..., whose information can be captured by an expression like A^n. More accurate models for the languages specified by English or by human DNA sequences would permit greater compression rates for messages in these languages.

Example 4.1 Let X be the probability space {A, B, C} of three elements, and assume that P(A) = 1/2, P(B) = 1/4, and P(C) = 1/4. The entropy of the space is then

    P(A) log_2(2) + P(B) log_2(4) + P(C) log_2(4) = 1.5.

An optimal encoding is attained by encoding A as 0, B as 10, and C as 11. With this encoding one expects to use an average of 1.5 bits to transmit a message.

The following example gives methods by which we might construct models for the English language.

Example 4.2 (Empirical models for English) First, choose a standard encoding; this might be an encoding as strings over the set {A, ..., Z} or over the ASCII alphabet. Next, choose a sample text.
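The entropy and the expected encoded length in Example 4.1 can be checked numerically. The following sketch is plain Python (which SAGE, used later in the text, also accepts); the function names are illustrative, not from the text.

```python
from math import log2

# Probability space of Example 4.1.
P = {'A': 0.5, 'B': 0.25, 'C': 0.25}

def entropy(P):
    """H(X) = sum over x of P(x) * log2(1/P(x)),
    with the convention that terms with P(x) = 0 contribute 0."""
    return sum(p * log2(1 / p) for p in P.values() if p > 0)

# The prefix-free encoding of Example 4.1: A -> 0, B -> 10, C -> 11.
code = {'A': '0', 'B': '10', 'C': '11'}

# Expected bit-length of a random encoded message.
expected_length = sum(P[x] * len(code[x]) for x in P)

print(entropy(P))       # 1.5
print(expected_length)  # 1.5
```

That the two values agree reflects the optimality of this encoding: each message x is encoded with exactly log_2(P(x)^{-1}) bits.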
The text might be the complete works of Shakespeare, the short story The Black Cat by Edgar Allan Poe, or the U.S. East Coast edition of the New York Times from 1 January 2000 to 31 January 2000. The following are finite probability spaces for English, based on these choices:

1. Let X be the set of characters of the encoding and set P(c) to be the probability that the character c occurs in the sample text.

2. Let X be the set of character pairs over the encoding alphabet and set P(x) to be the probability that the pair x = c_1 c_2 occurs in the sample text.

3. Let X be the set of words in the sample text, and set P(x) to be the probability that the word x occurs in the sample text.
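Model (1) above, the single-character frequency model, can be sketched as follows. The sample string and function names here are illustrative stand-ins; a real model would be built from a large corpus such as those suggested in Example 4.2.

```python
from math import log2
from collections import Counter

def char_model(text):
    """Model (1): the probability space on characters, with
    P(c) = relative frequency of the character c in the sample text."""
    counts = Counter(text)
    total = sum(counts.values())
    return {c: n / total for c, n in counts.items()}

def entropy(P):
    """H(X) = sum of P(x) * log2(1/P(x)) over the space."""
    return sum(p * log2(1 / p) for p in P.values() if p > 0)

# A tiny stand-in sample text; entropy estimates from so little data
# are unreliable, but the construction is the same for any corpus.
sample = "THE QUICK BROWN FOX JUMPS OVER THE LAZY DOG"
P = char_model(sample)
print(entropy(P))  # about 4.4 bits per character for this sample
```

Models (2) and (3) are built the same way, replacing `Counter(text)` with counts of adjacent character pairs or of whitespace-separated words.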