Cryptography and Security - Computer Science

CS4413a – fall 2011 

Cryptography and Security 

LUCIAN ILIE 

© 2011 by Lucian Ilie


1 INTRODUCTION 

1.1 Why do we need cryptography 

“Unfortunately, the technical wizardry enabling remote collaborations is founded on broadcasting 

everything as sequences of zeros and ones that one’s dog wouldn’t recognize. What is to distinguish 

a digital dollar when it is as easily reproducible as the spoken word? How do we converse privately

when every syllable is bounced off a satellite and smeared over an entire continent? How should a

bank know that it really is Bill Gates requesting from his laptop in Fiji a transfer of $10,000,000,000

to another bank?

Fortunately, the magical mathematics of cryptography can help. Cryptography provides techniques 

for keeping information secret, for determining that information has not been tampered with, and 

for determining who authored pieces of information.” 

Ronald Rivest, Foreword to Handbook of Applied Cryptography

1.2 Goals of cryptography

- fundamental objective – to enable Alice and Bob to communicate over an insecure channel such that Oscar 

cannot understand what is being said; see Fig. 1. 

Figure 1: Two-party communication using encryption

[Diagram: Alice (plaintext source) encrypts x as e_K(x) = c and sends the ciphertext c over an unsecured channel, which the adversary Oscar observes; Bob (destination) decrypts d_K(c) = x.]

1. Confidentiality – secrecy of data (historical goal); ensures that the data is not understood by anyone 

other than the intended receiver 

2. Data Integrity – prevents unauthorized alteration of data; must be able to detect data manipulation 

(i.e., insertion, deletion, substitution) 

3. Authentication – identification of both parties (the sender and the receiver should identify each other) 

and of information (origin, date of origin, data content, time sent, etc.) 

- data origin authentication – verifies the source of data 

- entity authentication – verifies the identity of the other party; i.e., ensures that you are not talking to 

an impostor 

4. Non-repudiation – prevents a party from denying previous actions


Example 1.1. (i) User A transmits a file to user B. User C, who is not authorized to read it, intercepts the 

file during transmission. 

(ii) A network manager D sends to a computer E an updated file with the users having access to E. User F

intercepts the message and adds or deletes entries.

(iii) As in (ii), but F now creates a new file and sends it to E, which believes the file comes from D.

(iv) A customer C sends a message to a stockbroker D with instructions for various transactions. Subsequently, 

the investments lose value and the customer denies sending the message. 

□ 

Cryptography – the study of mathematical techniques related to aspects of information security such as 

confidentiality, data integrity, and authentication. 

Cryptanalysis – the study of the mathematical techniques for attempting to defeat cryptographic techniques, 

and, more generally, information security services 

Cryptology – the study of cryptography and cryptanalysis 

1.3 Definitions and notations 

- plaintext (message) – the (non-encrypted) text of the message 

- ciphertext – plaintext encrypted 

- cryptosystem (cipher) – (P,C,K,E,D) 

P – finite set of plaintexts 

C – finite set of ciphertexts 

K – finite set of keys 

for each K ∈ K: 

e_K ∈ E, e_K : P → C – encryption rule (algorithm)

d_K ∈ D, d_K : C → P – decryption rule (algorithm)

such that d_K(e_K(x)) = x, for any plaintext x ∈ P

- sender (Alice) 

- receiver (Bob) 

- adversary or opponent or attacker (Oscar) 

Why keys? Would encryption and decryption functions alone not be enough?

(i) - if some particular transformation revealed – the entire scheme need not be redesigned; just a new key 

(ii) - changing keys frequently – sound cryptographic practice 

(analogy: resettable combination lock) 

- encryption and decryption protocols 

1. Alice and Bob agree on a random key K ∈ K

2. Alice has the plaintext x = x_1 x_2 ... x_n, x_i ∈ P

3. Alice computes the ciphertext y = y_1 y_2 ... y_n, where y_i = e_K(x_i)

4. Bob receives y and computes x = d_K(y_1) ... d_K(y_n)

Notes: 

- the encryption function must be injective 

- if P = C, then the encryption function is a permutation 

- a fundamental premise in cryptography is that the sets P,C,K are public knowledge 

1.4 Security 

Security attacks – specifies whether the adversary interferes or not with the information 

- passive – the goal is to obtain the information transmitted 

- release of message content – e.g., from a telephone conversation, e-mail, transferred files, etc.


- traffic analysis – e.g., location and identity of communicating hosts, frequency and length of messages, 

the nature of messages 

- active attacks – involves some modification of the data stream 

- masquerade – pretending to be a different entity 

- replay – passive capture of a data unit and subsequent retransmission 

- modification of messages 

- denial of service 

Passive attacks are difficult to detect but easy to prevent whereas active attacks are easy to detect but 

difficult to prevent. 

Security attacks can also be divided into on-line and off-line. 

Example 1.2. Trying to find a password has no chance on-line but becomes quite possible off-line. 

Types of attacks – specifies the information available to the adversary 

- ciphertext-only – the adversary possesses only a string of ciphertext 

- known plaintext – the adversary possesses a string of plaintext and the corresponding ciphertext 

- chosen plaintext – the adversary selects a string of plaintext and then obtains the corresponding ciphertext 

- chosen ciphertext – the adversary selects a string of ciphertext and then obtains the corresponding plaintext

The attacks can also be classified by the approach used into 

- cryptanalysis – when the attack relies on the nature of the algorithm plus some information such as that

listed above, and

- brute force – when all keys (on average half) are tried until a good one is found; below are some estimates 

on the time needed by brute force attacks for various key sizes and speeds. 

Key size (bits)   Number of keys         Time at 1 encryption/µs          Time at 10^6 encryptions/µs

32                2^32 ≈ 4.3×10^9        2^31 µs ≈ 35.8 min               ≈ 2.15 ms

56                2^56 ≈ 7.2×10^16       2^55 µs ≈ 1142 years             ≈ 10.01 hours

128               2^128 ≈ 3.4×10^38      2^127 µs ≈ 5.4×10^24 years       ≈ 5.4×10^18 years

168               2^168 ≈ 3.7×10^50      2^167 µs ≈ 5.9×10^36 years       ≈ 5.9×10^30 years

26 characters     26! ≈ 4×10^26          2×10^26 µs ≈ 6.4×10^12 years     ≈ 6.4×10^6 years

It is important to mention that trying a key does not mean only decrypting using that key but also identifying 

whether the obtained plaintext is the valid one. For instance, if a random (meaningless) sequence of bits is 

encrypted, then it is impossible to decrypt simply because even after all keys are tried the attacker does not 

know which one is the correct plaintext. 
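The estimates in the table can be reproduced with a few lines of arithmetic. The sketch below is only an illustration: the function name and the 365-day year are assumptions, and, as in the table, on average half of the keys are tried.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600  # assumption: a 365-day year


def average_search_seconds(num_keys, encryptions_per_us=1):
    """Average brute-force time in seconds, trying half of the keys."""
    trials = num_keys / 2
    microseconds = trials / encryptions_per_us
    return microseconds / 1e6


for bits in (32, 56, 128, 168):
    slow = average_search_seconds(2 ** bits)
    fast = average_search_seconds(2 ** bits, encryptions_per_us=10 ** 6)
    print(f"{bits:3d} bits: {slow / SECONDS_PER_YEAR:.3g} years at 1/µs, "
          f"{fast / SECONDS_PER_YEAR:.3g} years at 10^6/µs")
```

The function only counts trial encryptions; it does not model the cost of recognizing the correct plaintext, which the note above points out is part of trying a key.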

Adversarial goal – specifies what it means for the adversary to “break” the system 

- complete break – find out the key 

- partial break – decrypt some ciphertext (or determine some partial information about the plaintext) 

- distinguishability – distinguish between valid ciphertext and random strings 

Security level – specifies the computational resources available to the adversary 

- unconditional security – infinite computational resources 

- computational security – measures the amount of computational effort required, by the best currently 

known methods, to defeat a system 

- provable security – breaking the system is shown to be essentially as difficult as solving a

well-known (supposedly) difficult problem (usually number-theoretic)

In practice a system is usually called secure if either the cost to break it exceeds the value of the information 

obtained or the time required to break it exceeds the lifetime of the information. Also, any attack should take 

no less than brute force. 

Ciphers 

- by types of operations


- substitutions – each element of the plaintext (bit, letter, group of bits or letters) is mapped into another 

element 

- transpositions (permutation) – elements of plaintexts are rearranged 

- number of keys used 

- one for both sender and receiver – symmetric encryption (see below) 

- two different keys – public-key encryption (see below) 

- by the way the plaintext is processed 

- block cipher – one block of the input is processed at a time producing one block in the output 

- stream cipher – the input is processed continuously producing one element of the output at a time 

1.5 Symmetric-key encryption 

- for any pair (e K ,d K ), it is computationally easy to determine d K knowing only e K 

- both must be secret 

- called also secret-key or conventional encryption 

- see Fig. 2 

Figure 2: Two-party communication using encryption and a secure channel for key exchange

[Diagram: a key source delivers the key to Alice and Bob over a secure channel; Alice (plaintext source) encrypts e_K(x) = c and sends c, which Oscar can observe; Bob (destination) decrypts d_K(c) = x.]

Key distribution problem – finding an efficient method to agree upon and exchange keys securely 

1.6 Public-key encryption 

- for any pair (e K ,d K ), it is computationally infeasible to determine d K knowing e K 

- e K can be made public 

- anyone can encrypt 

- only Bob can decrypt 

- see Fig. 3 

(analogy: box with a resettable combination lock) 

The encryption function is a trapdoor one-way function

- one-way – y = f(x) is easy to compute but f −1 (y) is computationally infeasible 

- trapdoor one-way – a one-way function with the property that given some additional information 

(trapdoor information) it becomes feasible to compute f −1 (y)


Figure 3: Encryption using public-key techniques

[Diagram: Bob's key source produces the pair (e, d); e is sent to Alice and may be seen by Oscar, while d stays with Bob. Alice (plaintext source) encrypts e_K(x) = c and sends c; Bob (destination) decrypts d_K(c) = x.]

Example 1.3. A very intuitive example of a trapdoor one-way function is the following. Assume we take the 

phone book of a large city, say Toronto, and produce another book which has the same entries but sorted by 

phone numbers instead of names. The one-way function, f, associates with each name the corresponding phone 

number. It is very easy to compute f; just look into the phone book. But if you want to compute the inverse 

of f, that is very difficult; given a phone number, one has to read all entries in the phone book until the person 

having that phone number is found. The trapdoor is the book ordered by phone numbers. Having it makes 

computing f −1 as easy as computing f. 

□ 

Example 1.4. One-way function – discrete logarithm problem

f : {1,2,...,16} → {1,2,...,16}

f(x) = 3^x mod 17

f(x) is relatively easy to compute

f^{-1}(7) = ? (answer: 11)

□ 
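As a small check of Example 1.4, the sketch below evaluates f with fast modular exponentiation and inverts it by exhaustive search. The names `f` and `f_inverse` are illustrative only; for a modulus this small the search is trivial, and the one-way property only appears for very large moduli.

```python
P = 17  # modulus from Example 1.4
G = 3   # base from Example 1.4


def f(x):
    """Forward direction: modular exponentiation, fast even for huge numbers."""
    return pow(G, x, P)


def f_inverse(y):
    """Inverse direction: no shortcut is used here, only exhaustive search."""
    for x in range(1, P):
        if f(x) == y:
            return x
    raise ValueError("no preimage")


print(f(11), f_inverse(7))
```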

Example 1.5. One-way function – integer factorization problem 

- multiplication of two integers is easy 

- what are the factors of 2624653723? (answer: 48611 and 53993) □

Example 1.6. Trapdoor one-way function 

(i) Subset-sum problem - NP-complete

- given positive integers $(s_1, s_2, \ldots, s_n, T)$

- find (if any) a binary vector $x = (x_1, x_2, \ldots, x_n)$ such that

$$\sum_{i=1}^{n} x_i s_i = T$$

(ii) Subset-sum problem for superincreasing vectors - easy

$(s_1, s_2, \ldots, s_n)$ is superincreasing if $s_j > \sum_{i=1}^{j-1} s_i$, $2 \le j \le n$

(iii) Trapdoor version – we have a superincreasing vector and transform it such that it looks ordinary

- choose a prime modulus $p > \sum_{i=1}^{n} s_i$ and a multiplier $1 \le a \le p-1$

- put $t_i = a s_i \bmod p$; $t = (t_1, t_2, \ldots, t_n)$ looks ordinary

$$y = e_K(x_1, \ldots, x_n) = \sum_{i=1}^{n} x_i t_i$$

trapdoor: s, p and a – knowing them Bob can decrypt easily (superincreasing vector)

- Bob computes $z = a^{-1} y \bmod p$

and solves the (easy) problem $(s_1, \ldots, s_n, z)$

□
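The trapdoor construction of Example 1.6 (iii) can be sketched as follows. The vector `S`, the prime `P` and the multiplier `A` are hypothetical toy values chosen only for illustration; `pow(A, -1, P)` needs Python 3.8 or later.

```python
S = (2, 3, 7, 14, 30, 57, 120, 251)  # hypothetical superincreasing vector (sum 484)
P = 491                              # hypothetical prime modulus, larger than sum(S)
A = 41                               # hypothetical multiplier, 1 <= A <= P - 1
T = tuple(A * s % P for s in S)      # public vector t_i = a * s_i mod p


def encrypt(bits):
    """y = sum of x_i * t_i, using only the public vector T."""
    return sum(x * t for x, t in zip(bits, T))


def decrypt(y):
    """Use the trapdoor (S, P, A): undo the multiplier, then solve greedily."""
    z = pow(A, -1, P) * y % P
    bits = []
    for s in reversed(S):            # largest element first
        if s <= z:
            bits.append(1)
            z -= s
        else:
            bits.append(0)
    if z != 0:
        raise ValueError("not a valid ciphertext")
    return tuple(reversed(bits))


message = (1, 0, 1, 1, 0, 0, 1, 0)
print(decrypt(encrypt(message)) == message)
```

Decryption works because z equals the sum of the selected s_i modulo p, and that sum is smaller than p, so the superincreasing problem is solved over the integers.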


2 SEVERAL CLASSICAL SYSTEMS 

2.1 Modular arithmetic 

a,b,m ∈ Z (integers), m > 0 

a ≡ b (mod m) iff m divides b−a (m is called modulus) 

a = q 1 m+r 1 , b = q 2 m+r 2 (q 1 and 0 ≤ r 1 ≤ m−1 are unique) 

a mod m = r 1 is the remainder of a divided by m (q 1 is the quotient) 

a ≡ b (mod m) iff r 1 = r 2 

a mod m means that a is reduced modulo m 

Arithmetic modulo m 

Z m = {0,1,2,...,m−1} 

operations: + and ×; done like in Z with the result reduced modulo m 

example: 11×13 = 15 in Z 16 

rules of modular arithmetic: (Z m ,+,×) is a commutative ring 

addition: closed, commutative, associative, (additive) identity: 0; (additive) inverse: −a 

multiplication: closed, commutative, associative, (multiplicative) identity: 1 

distributivity of multiplication over addition 

2.2 The shift cipher 

We shall use Z 26 since there are 26 letters in English 

- the correspondence is 

A B C D E F G H I J K L M 

0 1 2 3 4 5 6 7 8 9 10 11 12 

N O P Q R S T U V W X Y Z 

13 14 15 16 17 18 19 20 21 22 23 24 25 

The shift cipher is called monoalphabetic since each letter is always mapped to the same letter. 

The Shift Cipher 

P = C = K = Z 26 

encryption: e K (x) = x+K mod 26 

decryption: d K (y) = y −K mod 26 

Example 2.1. Here we have K = 11: 

x = wewillmeetatmidnight 

e 11 (x) = HPHTWWXPPELEXTOYTRSE 

□ 

Cryptanalysis (ciphertext only) 

– the Shift Cipher can be easily broken by exhaustive key search – only 26 keys 
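A minimal sketch of the Shift Cipher and of the exhaustive key search, following the notes' convention of lowercase plaintext and uppercase ciphertext. The function names are illustrative only.

```python
def shift_encrypt(plaintext, key):
    """e_K(x) = x + K mod 26, lowercase plaintext to uppercase ciphertext."""
    return "".join(chr((ord(c) - ord("a") + key) % 26 + ord("A")) for c in plaintext)


def shift_decrypt(ciphertext, key):
    """d_K(y) = y - K mod 26, uppercase ciphertext to lowercase plaintext."""
    return "".join(chr((ord(c) - ord("A") - key) % 26 + ord("a")) for c in ciphertext)


def all_decryptions(ciphertext):
    """Exhaustive key search: only 26 candidates to inspect."""
    return {key: shift_decrypt(ciphertext, key) for key in range(26)}


ciphertext = shift_encrypt("wewillmeetatmidnight", 11)
print(ciphertext)
for key, candidate in all_decryptions(ciphertext).items():
    print(key, candidate)
```

The attacker still has to recognize which of the 26 candidates is meaningful text; here only key 11 yields English.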

2.3 The substitution cipher 

The Substitution Cipher 

P = C = Z 26 (or the English alphabet)


K = {π | π is a permutation of Z 26 } 

encryption: e π (x) = π(x) 

decryption: d_π(y) = π^{-1}(y)

- monoalphabetic cipher 

Example 2.2. Consider the permutation π given by the table below (plaintext letter above, its image below):

a b c d e f g h i j k l m n o p q r s t u v w x y z

X N Y A H P O G Z Q W B T S F L R C V M U E K J D I

We have then

x = thisciphertextcannotbedecrypted

e_π(x) = MGZVYZLGHCMHJMYXSSFMNHAHYCDLMHA


- exhaustive key search is infeasible since there are 26! keys 

- can be decrypted using frequency analysis (long enough messages) 

□ 

2.4 The affine cipher 

Congruences 

1. the congruence mod m is an equivalence relation 

2. If a ≡ b mod m and c ≡ d mod m, then a±c ≡ b±d mod m 

3. If a ≡ b mod m and d | m, then a ≡ b mod d 

4. If a ≡ b mod m and a ≡ b mod n with gcd(m,n) = 1, then a ≡ b mod mn (m,n are called relatively 

prime ) 

- multiplicative inverse of a is a −1 such that aa −1 ≡ a −1 a ≡ 1 (mod m) 

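Theorem 2.3. The congruence ax ≡ b mod m has a unique solution x ∈ Z_m for every b ∈ Z_m iff gcd(a,m) = 1.

Proof. If gcd(a,m) = 1 and ax_1 ≡ ax_2 mod m, then m | a(x_1 − x_2). Since gcd(a,m) = 1, this gives m | (x_1 − x_2) and, as x_1, x_2 ∈ Z_m, we must have x_1 = x_2. Thus,

for every b, the congruence has at most one solution. Since x ↦ ax mod m is then an injective map of the finite set Z_m into itself, it is also surjective. Therefore, the congruence has exactly one solution.

Conversely, if d = gcd(a,m) ≥ 2, then ax ≡ 1 mod m implies d | ax − 1 and, since d | a, also d | 1, a contradiction. □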


Corollary 2.4. a ∈ Z m has a multiplicative inverse iff gcd(a,m) = 1. 

- field – a ring in which every non-zero element has an inverse 

- if m is prime, then Z m is a commutative field 

Euler’s phi-function φ(m) gives the number of integers in Z m that are relatively prime with m 

Theorem 2.5. If $m = \prod_{i=1}^{n} p_i^{e_i}$, with $p_i$ distinct primes and $e_i \ge 1$, then

$$\varphi(m) = \prod_{i=1}^{n} \left( p_i^{e_i} - p_i^{e_i - 1} \right).$$

The Affine Cipher

P = C = Z_26

K = {(a,b) ∈ Z_26 × Z_26 | gcd(a,26) = 1}

encryption: e_(a,b)(x) = ax + b mod 26

decryption: d_(a,b)(y) = a^{-1}(y − b) mod 26

- monoalphabetic cipher


- number of keys is mφ(m); e.g., for m = 60, there are 960 keys 

- can be decrypted using frequency analysis; we guess two letters, compute a and b and then test whether 

the guess was correct 
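The guess-and-test procedure can be sketched as below: two guessed (plaintext, ciphertext) letter pairs give two congruences in a and b, which are solved modulo 26 and rejected when a is not invertible. The function names are illustrative, the sample string is the first ten letters of the ciphertext in Example 2.6, and `pow(x, -1, 26)` needs Python 3.8 or later.

```python
from math import gcd


def num(letter):
    """Map a letter to Z_26 (a/A -> 0, ..., z/Z -> 25)."""
    return ord(letter.lower()) - ord("a")


def recover_key(p1, c1, p2, c2):
    """Solve a*p1 + b = c1 and a*p2 + b = c2 (mod 26); None if the guess is illegal."""
    x1, y1, x2, y2 = num(p1), num(c1), num(p2), num(c2)
    dx = (x1 - x2) % 26
    if gcd(dx, 26) != 1:
        return None                      # the two congruences do not determine a
    a = (y1 - y2) * pow(dx, -1, 26) % 26
    if gcd(a, 26) != 1:
        return None                      # a has no inverse, so it is not a valid key
    b = (y1 - a * x1) % 26
    return a, b


def affine_decrypt(ciphertext, a, b):
    """d(y) = a^{-1} (y - b) mod 26, uppercase ciphertext to lowercase plaintext."""
    a_inv = pow(a, -1, 26)
    return "".join(chr(a_inv * (num(c) - b) % 26 + ord("a")) for c in ciphertext)


for guess in "DEHK":                     # candidate images of t, with e -> R fixed
    print(guess, recover_key("e", "R", "t", guess))
print(affine_decrypt("FMXVEDKAPH", 3, 5))
```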

Example 2.6. Assume the ciphertext 

FMXVEDKAPHFERBNDKRXRSREFMORUDSDKDVSHVUFEDKAPRKDLYEVLRHHRH

Most frequent letters: R (8), D (7), E, H, K (5), and F, S, V (4). 

- e encrypted as R and t as D give a = 6, illegal 

- e encrypted as R and t as E give a = 13, illegal 

- e encrypted as R and t as H give a = 8, illegal 

- e encrypted as R and t as K give a = 3, legal; b = 5, d_K(y) = 9y − 19, which gives a meaningful message, so

the key must be correct:

algorithmsarequitegeneraldefinitionsofarithmeticprocesses

□

2.5 The Vigenère cipher

The Vigenère Cipher 

P = C = K = (Z 26 ) m 

encryption (key K = (k 1 ,...,k m )): 

e K (x 1 ,...,x m ) = (x 1 +k 1 mod 26,...,x m +k m mod 26) 

decryption: d K (y 1 ,...,y m ) = (y 1 −k 1 mod 26,...,y m −k m mod 26) 

The Vigenère cipher is not monoalphabetic since the same letter can be mapped to several different letters. 

It is called polyalphabetic. Frequency analysis does not work here! At least as done so far.


Example 2.7. K = Cipher, m = 6 

thiscryptosystemisnotsecure 

CIPHERCIPHERCIPHERCIPHERCIP 

VPXZGIAXIVWPUBTTMJPWIZITWZT 

□ 
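Example 2.7 can be reproduced with the short sketch below, which repeats the keyword along the text and adds or subtracts it letter by letter modulo 26. The function names are illustrative only.

```python
from itertools import cycle


def _combine(text, key, sign):
    """Add (sign=+1) or subtract (sign=-1) the repeated key, letter by letter."""
    shifts = cycle(ord(k) - ord("a") for k in key.lower())
    return [(ord(c) - ord("a") + sign * k) % 26 for c, k in zip(text.lower(), shifts)]


def vigenere_encrypt(plaintext, key):
    return "".join(chr(n + ord("A")) for n in _combine(plaintext, key, +1))


def vigenere_decrypt(ciphertext, key):
    return "".join(chr(n + ord("a")) for n in _combine(ciphertext, key, -1))


print(vigenere_encrypt("thiscryptosystemisnotsecure", "cipher"))
```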

Figure 4: Vigenère square 


- number of keys: 26^m – too large for exhaustive search

- frequency of letters is not relevant

- considered unbreakable for a long time, until Kasiski

Kasiski’s method 

- find first the length of the key 

- key observation: identical segments of the plaintext which are at distance divisible by m will be encrypted 

the same way 

- find several pairs of identical segments in the ciphertext

- the greatest common divisor of the distances between them will give (with a high probability) m

- use frequency analysis for each class of letters encrypted the same way 

2.6 The Hill cipher 

The Hill Cipher 

P = C = (Z 26 ) m 

K = {K | K is an m×m invertible matrix over Z 26 } 

encryption: e K (x) = xK all operations in Z 26 

decryption: d K (y) = yK −1 all operations in Z 26 

- polyalphabetic system 

Example 2.8.

$$K = \begin{pmatrix} 11 & 8 \\ 3 & 7 \end{pmatrix}, \qquad K^{-1} = \begin{pmatrix} 7 & 18 \\ 23 & 11 \end{pmatrix}$$

x = july = ((9,20),(11,24)), y = ((3,4),(11,22)) = DELW

□ 
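The computation in Example 2.8 can be checked with the sketch below, which multiplies each block, viewed as a row vector, by the key matrix over Z_26. The helper names are illustrative, and the text length is assumed to be a multiple of the block size.

```python
K = ((11, 8), (3, 7))        # key matrix from Example 2.8
K_INV = ((7, 18), (23, 11))  # its inverse over Z_26


def row_times_matrix(row, matrix, modulus=26):
    """Compute the row vector `row * matrix` with all operations in Z_26."""
    columns = range(len(matrix[0]))
    return [sum(row[i] * matrix[i][j] for i in range(len(row))) % modulus
            for j in columns]


def hill_apply(text, matrix):
    """Apply e_K(x) = xK (or d_K(y) = yK^{-1}) block by block; returns lowercase."""
    numbers = [ord(c) - ord("a") for c in text.lower()]
    size = len(matrix)
    out = []
    for start in range(0, len(numbers), size):
        out.extend(row_times_matrix(numbers[start:start + size], matrix))
    return "".join(chr(n + ord("a")) for n in out)


print(hill_apply("july", K).upper(), hill_apply("DELW", K_INV))
```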

Cryptanalysis (known or chosen plaintext) 

- Oscar knows (chooses) m plaintexts x_i ∈ (Z_26)^m and (finds out) the corresponding ciphertexts y_i, 1 ≤ i ≤ m

- consider the matrices X, Y ∈ (Z_26)^{m×m} having as rows the x_i's and y_i's

- the equation Y = XK gives the key K = X^{-1}Y (assuming X is invertible; if chosen plaintext, then Oscar

will make sure of that)

Example 2.9. Assume m = 2 and the plaintext friday is encrypted as PQCFKU, i.e., e K (5,17) = (15,16), 

e K (8,3) = (2,5), e K (0,24) = (10,20). From the first two: 

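$$\begin{pmatrix} 15 & 16 \\ 2 & 5 \end{pmatrix} = \begin{pmatrix} 5 & 17 \\ 8 & 3 \end{pmatrix} K$$

and so

$$K = \begin{pmatrix} 5 & 17 \\ 8 & 3 \end{pmatrix}^{-1} \begin{pmatrix} 15 & 16 \\ 2 & 5 \end{pmatrix} = \begin{pmatrix} 9 & 1 \\ 2 & 15 \end{pmatrix} \begin{pmatrix} 15 & 16 \\ 2 & 5 \end{pmatrix} = \begin{pmatrix} 7 & 19 \\ 8 & 3 \end{pmatrix}$$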

This can be verified by the third pair. 

□ 
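The known-plaintext attack of Example 2.9 amounts to one matrix inversion and one multiplication over Z_26. The sketch below handles only the 2×2 case via the adjugate formula; the helper names are illustrative, and `pow(det, -1, 26)` (Python 3.8 or later) raises `ValueError` when X is not invertible.

```python
def inverse_2x2(matrix, modulus=26):
    """Inverse of a 2x2 matrix over Z_26 via the adjugate formula."""
    (a, b), (c, d) = matrix
    det_inv = pow((a * d - b * c) % modulus, -1, modulus)
    return ((d * det_inv % modulus, -b * det_inv % modulus),
            (-c * det_inv % modulus, a * det_inv % modulus))


def mat_mul(left, right, modulus=26):
    """Product of two matrices with entries reduced modulo 26."""
    return tuple(
        tuple(sum(left[i][k] * right[k][j] for k in range(len(right))) % modulus
              for j in range(len(right[0])))
        for i in range(len(left)))


X = ((5, 17), (8, 3))    # plaintext blocks "fr", "id" as rows
Y = ((15, 16), (2, 5))   # ciphertext blocks "PQ", "CF" as rows
K = mat_mul(inverse_2x2(X), Y)
print(K)
print(mat_mul(((0, 24),), K))  # third block "ay" should give "KU" = (10, 20)
```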

2.7 The permutation cipher 

Known also as transposition cipher. 

The Permutation Cipher 

P = C = (Z 26 ) m 

K = {π | π is a permutation of {1,2,...,m}} 

encryption: e π (x 1 ,...,x m ) = (x π(1) ,...,x π(m) ). 

decryption: d π (y 1 ,...,y m ) = (y π −1 (1),...,y π −1 (m)) 

- polyalphabetic system 

Example 2.10. Suppose m = 6 and

$$\pi = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 \\ 3 & 5 & 1 & 6 & 4 & 2 \end{pmatrix}$$

The inverse of π is

$$\pi^{-1} = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 \\ 3 & 6 & 1 & 5 & 2 & 4 \end{pmatrix}$$

We can then use π for encryption as below:

shesel lsseas hellsb ythese ashore

EESLSH SALSES LSHBLE HSYEET HRAEOS

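We show next that the permutation cipher is a particular case of the Hill cipher. Given π we construct the

matrix $K_\pi = (k_{ij})$ by

$$k_{ij} = \begin{cases} 1 & \text{if } i = \pi(j) \\ 0 & \text{otherwise} \end{cases}$$

($K_\pi$ is a permutation matrix.) It is easy to see that encrypting using π in the permutation cipher is the

same as encrypting using $K_\pi$ in the Hill cipher. Moreover, $K_\pi^{-1} = K_{\pi^{-1}}$.

For the example above, we have

$$K_\pi = \begin{pmatrix} 0 & 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 \\ 1 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 & 0 \end{pmatrix} \qquad K_{\pi^{-1}} = \begin{pmatrix} 0 & 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 & 0 \\ 1 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 1 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 & 0 \end{pmatrix}$$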

□
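A short sketch of the permutation cipher of Example 2.10, together with the matrix K_π built from the rule k_ij = 1 iff i = π(j). The function names are illustrative, and the text length is assumed to be a multiple of m.

```python
PI = (3, 5, 1, 6, 4, 2)  # pi(1), ..., pi(6) from Example 2.10


def inverse_permutation(pi):
    """Return pi^{-1} in the same 1-based tuple form."""
    inverse = [0] * len(pi)
    for j, image in enumerate(pi, start=1):
        inverse[image - 1] = j
    return tuple(inverse)


def permute_blocks(text, pi):
    """Map each block (x_1, ..., x_m) to (x_pi(1), ..., x_pi(m))."""
    m = len(pi)
    blocks = (text[start:start + m] for start in range(0, len(text), m))
    return "".join("".join(block[pi[j] - 1] for j in range(m)) for block in blocks)


def permutation_matrix(pi):
    """K_pi with entry (i, j) equal to 1 exactly when i = pi(j)."""
    m = len(pi)
    return [[1 if i == pi[j - 1] else 0 for j in range(1, m + 1)]
            for i in range(1, m + 1)]


ciphertext = permute_blocks("shesellsseashellsbytheseashore", PI).upper()
print(ciphertext)
print(permute_blocks(ciphertext, inverse_permutation(PI)).lower())
```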


2.8 Stream ciphers 

- block ciphers – successive plaintext elements are encrypted using the same key:

y = y_1 y_2 ... = e_K(x_1) e_K(x_2) ...

- stream ciphers – a keystream z = z_1 z_2 ... is generated and used to encrypt:

y = y_1 y_2 ... = e_{z_1}(x_1) e_{z_2}(x_2) ...

- z_i depends on the key K and the previous plaintexts

- synchronous – independent of the plaintexts (a generator takes K as input and produces the keystream)

- non-synchronous – dependent on previous plaintext or ciphertext

- periodic – the keystream is periodic 

Example 2.11. Vigenère cipher is a periodic synchronous stream cipher with period the length of the key □ 

- assume P = C = L = Z_2, where L is the keystream alphabet

- linear: $z_{i+m} = \sum_{j=0}^{m-1} c_j z_{i+j} \bmod 2$, where the $c_j \in Z_2$ are fixed constants

K = (k_1, k_2, ..., k_m, c_0, ..., c_{m−1}), where k_1, ..., k_m are the initial values z_1, ..., z_m

- the keystream is obviously periodic

- if (c_0, ..., c_{m−1}) are suitably chosen, then any (k_1, ..., k_m) ≠ (0, ..., 0) will give rise to a periodic keystream

with (maximum) period 2^m − 1, which is desirable (Vigenère was cryptanalyzed using the fact that it has a short

period)

Example 2.12. Take m = 4 and z i+4 = z i +z i+1 mod 2. If the initial vector is different from (0,0,0,0) then 

we get a keystream with period 15: E.g.: 

1,0,0,0,1,0,0,1,1,0,1,0,1,1,1,1,... 

Such a linear (synchronous) stream cipher can be efficiently implemented in hardware using a linear

feedback shift register (LFSR). At each step:

- k_1 is output as the next keystream bit

- k_2, ..., k_m each shift one position to the left

- k_m becomes $\sum_{j=0}^{m-1} c_j k_{j+1}$, computed from the values before the shift (linear feedback)

An example of a LFSR is given in Fig. 5. It generates the keystream of Example 2.12. 

□ 

Figure 5: A LFSR

[Diagram: a four-stage shift register k_1, k_2, k_3, k_4 with an addition (+) node providing the feedback.]
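The register can be simulated in a few lines. This is a software sketch only, with an illustrative function name; the state list plays the role of (k_1, ..., k_m) and the coefficient list that of (c_0, ..., c_{m−1}).

```python
def lfsr_keystream(initial, coeffs, count):
    """Generate `count` keystream bits from the state (k_1..k_m) and taps (c_0..c_{m-1})."""
    state = list(initial)
    stream = []
    for _ in range(count):
        stream.append(state[0])                               # k_1 is the output bit
        feedback = sum(c * k for c, k in zip(coeffs, state)) % 2
        state = state[1:] + [feedback]                        # shift left, feed back
    return stream


# Example 2.12: z_{i+4} = z_i + z_{i+1} mod 2, i.e. (c_0, c_1, c_2, c_3) = (1, 1, 0, 0)
stream = lfsr_keystream([1, 0, 0, 0], [1, 1, 0, 0], 30)
print(stream[:16])
```

With the initial vector (1,0,0,0) the output repeats after 15 bits, which is the maximum period 2^4 − 1.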

Cryptanalysis of LFSR 

All operations are linear so it is vulnerable to a known-(chosen-)plaintext attack. 

A simple example of a non-synchronous stream cipher is the Autokey cipher.


The Autokey Cipher 

P = C = K = L = Z 26 

z 1 = K and z i = x i−1 , for i ≥ 2 

encryption: e z (x) = (x+z) mod 26 

decryption: d z (y) = (y −z) mod 26 
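A minimal sketch of the Autokey rules above, with illustrative function names. Note that decryption has to recover each plaintext letter before it can form the next keystream element.

```python
def autokey_encrypt(plaintext, key):
    """z_1 = K, z_i = x_{i-1}; y_i = x_i + z_i mod 26 (lowercase in, uppercase out)."""
    z = key
    out = []
    for c in plaintext:
        x = ord(c) - ord("a")
        out.append(chr((x + z) % 26 + ord("A")))
        z = x                       # the next keystream element is this plaintext letter
    return "".join(out)


def autokey_decrypt(ciphertext, key):
    """x_i = y_i - z_i mod 26, rebuilding the keystream as the plaintext is recovered."""
    z = key
    out = []
    for c in ciphertext:
        x = (ord(c) - ord("A") - z) % 26
        out.append(chr(x + ord("a")))
        z = x
    return "".join(out)


print(autokey_encrypt("rendezvous", 8))
```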

Example 2.13. Suppose K = 8. We have the following encryption (plaintext, keystream written as letters, ciphertext):

rendezvous

irendezvou

ZVRQHDUJIM

□

Notice that the autokey cipher is a modified Vigenère cipher where the key is the plaintext itself shifted by

one position. Vigenère could be broken by finding the length of the key. In autokey the key has the same

length as the plaintext. Still, because it is related to the plaintext, statistical techniques can still be applied.

Ideally, the key should be of the same length as the plaintext but completely unrelated to it. This is done in the

One-time pad cipher.

2.9 One-time pad

One-time Pad

n ≥ 1, P = C = K = (Z_2)^n

encryption: e_K(x) = (x_1 + K_1, ..., x_n + K_n) mod 2

decryption: d_K(y) = (y_1 + K_1, ..., y_n + K_n) mod 2

- advantage: Theorem 3.6 implies that One-time Pad is perfectly secure 

- disadvantages: 

- the key (which has to be securely communicated) is at least as long as the plaintext

- each key can be used only once

- vulnerable to a known-plaintext attack (the key is recovered as x + y mod 2), which is why a key must never be reused

- severe key management problems; not used commercially, but used in diplomatic and military settings

- much used for the Moscow-Washington hot-line 

- much used by the Russian agents operating in foreign countries 

Invented in 1918 (by Vernam), it was believed (intuitively!) to be unbreakable for many years, but only in 1949

did Shannon prove it. (See next chapter for proof.) The idea behind this is that, because the key is independent

of the plaintext, the ciphertext can be decrypted into anything! See the example below; notice that we work

over Z_27 (the 26 letters and the space).

Example 2.14. 

ANKYODKYUREPFJBYOJDSPLREYIUNOFDOIUERFPLUYTS 

pxlmvmsydoftyrvzwc tnlebnecvgdupahfzzlmnyih 

mr mustard with the candlestick in the hall 

ANKYODKYUREPFJBYOJDSPLREYIUNOFDOIUERFPLUYTS 

pftgpmaydgaxgoufhklllmhsqdqogtewbqfgyovuhwt 

miss scarlet with the knife in the library 

□
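The same effect can be sketched for the binary One-time Pad, where addition modulo 2 on each bit is the XOR of bytes. The messages below are hypothetical, and the standard-library `secrets` module is used only as a convenient source of random key bytes.

```python
import secrets


def xor_bytes(left, right):
    """Bitwise addition modulo 2 of two equal-length byte strings."""
    if len(left) != len(right):
        raise ValueError("key and text must have the same length")
    return bytes(a ^ b for a, b in zip(left, right))


message = b"attack at dawn"                  # hypothetical plaintext
key = secrets.token_bytes(len(message))      # fresh random key, as long as the message
ciphertext = xor_bytes(message, key)
recovered = xor_bytes(ciphertext, key)       # decryption is the same operation

# For any other text of the same length there is a key that "explains" the ciphertext.
decoy = b"retreat at six"
decoy_key = xor_bytes(ciphertext, decoy)
print(recovered, xor_bytes(ciphertext, decoy_key))
```

Since every candidate plaintext of the right length corresponds to some key, the ciphertext alone does not favor any of them.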


3 PERFECT SECRECY 

3.1 Probability theory 

- recall that unconditional security assumes the cryptanalyst has infinite computational resources 

- we need probabilities to study unconditional security 

notations 

- X and Y discrete random variables 

- Prob(x) = Prob(X = x) – the probability that X takes value x 

- Prob(y) = Prob(Y = y) – the probability that Y takes value y 

- Prob(x,y) – joint probability – the probability that X takes value x and Y takes value y 

- Prob(x|y) – conditional probability – the probability that X takes value x given that Y takes value y 

- X and Y are independent if Prob(x,y) = Prob(x)Prob(y), for all x,y 

- Prob(x,y) = Prob(x|y)Prob(y) = Prob(y|x)Prob(x) 

Theorem 3.1 (Bayes’ Theorem). If Prob(y) > 0, then

$$\mathrm{Prob}(x \mid y) = \frac{\mathrm{Prob}(y \mid x)\,\mathrm{Prob}(x)}{\mathrm{Prob}(y)}.$$

Corollary 3.2. X and Y are independent iff Prob(x|y) = Prob(x), for all x,y. 

Example 3.3. Consider a random throw of a pair of dice. Let X be a random variable for the sum of the two 

dice and Y which takes the value D if the two dice are the same and N otherwise. The probability distributions 

for X and Y are shown below: 

x             2     3     4     5     6     7     8     9     10    11    12

Prob(X = x)   1/36  2/36  3/36  4/36  5/36  6/36  5/36  4/36  3/36  2/36  1/36

y             D     N

Prob(Y = y)   6/36  30/36

Two conditional probabilities are computed below:

Prob(D|4) (= Prob(Y = D | X = 4)) = 1/3

Prob(4|D) (= Prob(X = 4 | Y = D)) = 1/6

and so

Prob(4,D) = 1/36 = Prob(D|4)Prob(4) = Prob(4|D)Prob(D)

□ 
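The probabilities of Example 3.3 can be checked by enumerating the 36 equally likely outcomes with exact fractions. The helper names are illustrative only.

```python
from fractions import Fraction
from itertools import product

OUTCOMES = list(product(range(1, 7), repeat=2))  # 36 equally likely throws


def prob(event):
    """Probability of an event given as a predicate on an outcome."""
    return Fraction(sum(1 for o in OUTCOMES if event(o)), len(OUTCOMES))


def cond(event, given):
    """Conditional probability Prob(event | given) = Prob(both) / Prob(given)."""
    return prob(lambda o: event(o) and given(o)) / prob(given)


def sum_is_4(outcome):
    return sum(outcome) == 4


def is_double(outcome):
    return outcome[0] == outcome[1]


joint = prob(lambda o: sum_is_4(o) and is_double(o))
print(cond(is_double, sum_is_4), cond(sum_is_4, is_double), joint)
```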

3.2 Perfect secrecy 

notations – assume a cryptosystem (P,C,K,E,D) 

- Prob(x = x) – the (a priori) probability that the plaintext is x 

- Prob(K = K) – the probability that key K is chosen 

assumption: K and x are independent random variables 

- Prob(y = y) – the probability that the ciphertext is y 

- C(K) = {e K (x) | x ∈ P} – all ciphertexts obtained using K 

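We have

$$\mathrm{Prob}(\mathbf{y} = y) = \sum_{\{K \mid y \in C(K)\}} \mathrm{Prob}(\mathbf{K} = K)\,\mathrm{Prob}(\mathbf{x} = d_K(y))$$

Also

$$\mathrm{Prob}(\mathbf{y} = y \mid \mathbf{x} = x) = \sum_{\{K \mid x = d_K(y)\}} \mathrm{Prob}(\mathbf{K} = K)$$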


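We can now use Bayes’ theorem to compute the probability of a plaintext conditioned by a given ciphertext as

$$\mathrm{Prob}(\mathbf{x} = x \mid \mathbf{y} = y) = \frac{\mathrm{Prob}(\mathbf{x} = x) \displaystyle\sum_{\{K \mid x = d_K(y)\}} \mathrm{Prob}(\mathbf{K} = K)}{\displaystyle\sum_{\{K \mid y \in C(K)\}} \mathrm{Prob}(\mathbf{K} = K)\,\mathrm{Prob}(\mathbf{x} = d_K(y))}$$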

Example 3.4. Consider a cipher with P = {a,b}, C = {1,2,3,4}, K = {K 1 ,K 2 ,K 3 } with the distributions 

x             a     b

Prob(x = x)   1/4   3/4

K             K_1   K_2   K_3

Prob(K = K)   1/2   1/4   1/4

and the encryption mapping

e      a   b

K_1    1   2

K_2    2   3

K_3    3   4

We can then compute the following probabilities:

y             1     2     3     4

Prob(y = y)   1/8   7/16  1/4   3/16

Prob(x = x|y = y)   1     2     3     4

a                   1     1/7   1/4   0

b                   0     6/7   3/4   1

Prob(y = y|x = x)   a     b

1                   1/2   0

2                   1/4   1/2

3                   1/4   1/4

4                   0     1/4

□ 
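The tables of Example 3.4 follow mechanically from the formulas of this section. The sketch below recomputes them with exact fractions; the dictionary layout and function names are illustrative only.

```python
from fractions import Fraction as F

P_PLAIN = {"a": F(1, 4), "b": F(3, 4)}
P_KEY = {"K1": F(1, 2), "K2": F(1, 4), "K3": F(1, 4)}
ENC = {("K1", "a"): 1, ("K1", "b"): 2,
       ("K2", "a"): 2, ("K2", "b"): 3,
       ("K3", "a"): 3, ("K3", "b"): 4}


def p_cipher(y):
    """Prob(y): sum over keys K with y in C(K) of Prob(K) * Prob(d_K(y))."""
    return sum(P_KEY[k] * P_PLAIN[x] for (k, x), c in ENC.items() if c == y)


def p_cipher_given_plain(y, x):
    """Prob(y | x): sum of Prob(K) over keys K with e_K(x) = y."""
    return sum(P_KEY[k] for (k, px), c in ENC.items() if c == y and px == x)


def p_plain_given_cipher(x, y):
    """Prob(x | y) by Bayes' theorem."""
    return P_PLAIN[x] * p_cipher_given_plain(y, x) / p_cipher(y)


for y in (1, 2, 3, 4):
    print(y, p_cipher(y), p_plain_given_cipher("a", y), p_plain_given_cipher("b", y))
```

Since Prob(a | y) differs from Prob(a) = 1/4 for y = 1, 2 and 4, this cipher does not have perfect secrecy in the sense defined next.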

A cryptosystem has perfect secrecy if Prob(x = x|y = y) = Prob(x = x), for all x,y, that is, the 

(a posteriori) probability that the plaintext is x given y as ciphertext is always the same as the (a priori) 

probability that the plaintext is x. Put otherwise, y gives no information about x. 

Notice that this is equivalent, by Bayes’ theorem, to Prob(y = y|x = x) = Prob(y = y), for all x,y.

Theorem 3.5. Assume the Shift Cipher such that each character is encrypted using a new random equally 

probable key (of probability 1/26). Then, for any plaintext distribution, the Shift Cipher has perfect secrecy. 

Proof. Recall that P = C = K = Z 26 and e K (x) = x+K mod 26. For any ciphertext y, we have 

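$$\begin{aligned}
\mathrm{Prob}(\mathbf{y} = y) &= \sum_{K \in Z_{26}} \mathrm{Prob}(\mathbf{K} = K)\,\mathrm{Prob}(\mathbf{x} = d_K(y)) \\
&= \sum_{K \in Z_{26}} \frac{1}{26}\,\mathrm{Prob}(\mathbf{x} = y - K \bmod 26) \\
&= \frac{1}{26} \sum_{K \in Z_{26}} \mathrm{Prob}(\mathbf{x} = y - K \bmod 26) \\
&= \frac{1}{26} \sum_{x \in Z_{26}} \mathrm{Prob}(\mathbf{x} = x) \\
&= \frac{1}{26}.
\end{aligned}$$

(For fixed y, as K runs over Z_26 so does y − K mod 26, which justifies the change of summation index.)

We have also

$$\mathrm{Prob}(\mathbf{y} = y \mid \mathbf{x} = x) = \mathrm{Prob}(\mathbf{K} = y - x \bmod 26) = \frac{1}{26}$$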

and so the Shift Cipher (with a new random equally probable key for each letter) has perfect secrecy. □

Assume, for any y, Prob(y = y) > 0 (otherwise we can remove y from C). For a fixed x ∈ P, if the 

cryptosystem has perfect secrecy, we have Prob(y = y|x = x) = Prob(y = y) > 0. Thus, there must be K ∈ K 

such that e K (x) = y. It follows that |K| ≥ |C|. Encryption is injective, so also |C| ≥ |P|. 

Theorem 3.6 (Shannon). If |P| = |C| = |K|, then the cryptosystem has perfect secrecy iff 

(i) all keys are used with the same probability 

(ii) for every x ∈ P and y ∈ C, there is a unique K ∈ K such that e K (x) = y. 

Proof. Assume first the cryptosystem perfectly secure. 

(ii) We showed above that, for any x ∈ P, y ∈ C, there is at least one key K ∈ K such that e K (x) = y. But 

|K| = |C|, which gives that there is exactly one such key. 

(i) Fix y ∈ C and put P = {x 1 ,...,x n }. We can denote the keys by {K 1 ,...,K n } such that e Ki (x i ) = y, 

1 ≤ i ≤ |P|. We have then, using perfect secrecy, Prob(K = K i ) = Prob(y = y|x = x i ) = Prob(y = y), for all 

i. This means all keys are used with the same probability Prob(y). 

The converse implication is proved as Theorem 3.5. 

□ 

Corollary 3.7. One-time pad is perfectly secure.


4 DATA ENCRYPTION STANDARD 

DES is the most widely used cryptosystem. It encrypts blocks of 64 bits (into output blocks of 64 bits) using a 

56-bit key. 

4.1 History 

- 1960s – IBM’s Feistel designed Lucifer – Feistel block cipher which operates on blocks of 64 bits using a 

128-bit key 

- 1973 – NBS issued a request for proposals for a national cipher standard 

- an improved Lucifer (by IBM and NSA) was submitted – 56-bit key (required by NSA) – this was DES 

- much criticism 

- key too short to resist a brute force attack

- design criteria for the S-boxes were not public 

- 1994 - NIST recommended DES for applications other than protection of classified information 

- 1999 - NIST recommended only triple DES (two or three DES keys) 

4.2 Feistel ciphers 

The methods for breaking ciphers we presented were based on statistical analysis. Monoalphabetic systems were easy to break because statistics worked very well at the level of letters. Polyalphabetic ones were also possible to break because we could still use statistics. In Fig. 2.6 we can see how the frequency of letters changes from plaintexts to ciphertexts encrypted using various cryptosystems. Except for a random polyalphabetic cipher, all of the others still leave some information about the plaintext in the ciphertext. Ideally, no information about the plaintext or key should be revealed by the ciphertext. This is done in the one-time pad cipher, but then the length of the key is impractical. To achieve a similar effect (hopefully!) with a much smaller key, we use block ciphers (which, as we shall see, can be used to simulate stream ciphers, so are more general) with repeated stages. The essential idea goes back to Feistel-type ciphers.

In principle, we could use an arbitrary mapping which maps blocks of n bits into blocks of n bits. But then the size of the key would be proportional to 2^n, which would make it impractical. To thwart statistical attacks, blocks of 64 bits should be used, which would make the key size approximately 10^19. Therefore, we need another way to achieve similar effects. We are back to Feistel’s idea, which we describe in this section.

Before that, we briefly discuss two basic principles for preventing statistical cryptanalysis: diffusion

and confusion, suggested by Shannon. Diffusion means that the statistical structure of the plaintext should 

be dissipated into long range statistics of the ciphertext. For instance, each bit of the plaintext should affect 

the value of many ciphertext bits or, equivalently, each bit of the ciphertext is affected by many bits of the 

plaintext. So diffusion tries to make the statistical relation between plaintext and ciphertext as complex as 

possible. Diffusion is achieved by repeated permutation. 

Confusion tries to make the relationship between the statistics of the ciphertext and the key as complex as 

possible. Confusion is achieved by complex substitutions. 

The basic structure of a Feistel cipher is depicted in Fig. 3.5. It is a particular form of the substitution-permutation

network proposed by Shannon. We have a number of rounds, each consisting of

- a substitution on the left half of the data: a round function F is applied to the right half and the result is XORed

with the left half; in each round F depends on some subkey K_i

- a permutation: the two halves are interchanged


The important parameters of a Feistel cipher are: 

- block size – the larger the better; 64 is good enough; AES uses 128 

- key size – larger increases security but lowers speed; 64 is no longer good; 128 is common size 

- number of rounds – essential against more advanced attacks; typical size is 16 

- subkey generation algorithm – complex 

- round function – complex 

The encryption and decryption algorithms are basically the same with the difference that the subkeys for 

the decryption algorithm will be used in the reversed order; see Fig. 3.6. 

We show next that the decryption works as intended. With the notations in Fig. 3.6 we have, for all $1 \le i \le 16$,

$$LE_i = RE_{i-1}, \qquad RE_i = LE_{i-1} \oplus F(RE_{i-1}, K_i),$$

$$LD_i = RD_{i-1}, \qquad RD_i = LD_{i-1} \oplus F(RD_{i-1}, K_{17-i}).$$

We show by induction on i that

$$LD_i = RE_{16-i}, \qquad RD_i = LE_{16-i}.$$

In particular, for i = 16 we get that decryption gives back the plaintext. The equalities hold for i = 0 (the input of the decryption is the output of the last encryption round with its halves swapped). We assume they hold for i and prove them for i+1. We use the facts that ⊕ is associative, has 0 as identity, and each element is its own inverse (x⊕x = 0). We have

$$LD_{i+1} = RD_i = LE_{16-i} = RE_{16-(i+1)}$$

and

$$\begin{aligned}
RD_{i+1} &= LD_i \oplus F(RD_i, K_{16-i}) \\
&= RE_{16-i} \oplus F(LE_{16-i}, K_{16-i}) \\
&= LE_{15-i} \oplus F(RE_{15-i}, K_{16-i}) \oplus F(RE_{15-i}, K_{16-i}) \\
&= LE_{16-(i+1)}.
\end{aligned}$$


It is very important to notice that we did not assume anything on the function F. In particular, it need not be 

reversible. 
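The argument above can be checked with a small experiment. The following is a minimal sketch, not DES: the round function (a truncated SHA-256 hash) and the 16 subkeys are assumptions chosen only for illustration. The round function is deliberately not invertible, yet running the same network with the subkeys in reverse order recovers the plaintext.

```python
import hashlib

def round_function(right, subkey):
    """Hypothetical round function F: a truncated hash, deliberately not invertible."""
    data = right.to_bytes(4, "big") + subkey.to_bytes(4, "big")
    return int.from_bytes(hashlib.sha256(data).digest()[:4], "big")

def feistel(block, subkeys):
    """Apply the Feistel rounds to a 64-bit block, then swap the two halves."""
    left, right = block >> 32, block & 0xFFFFFFFF
    for k in subkeys:
        # L_i = R_{i-1},  R_i = L_{i-1} xor F(R_{i-1}, K_i)
        left, right = right, left ^ round_function(right, k)
    return (right << 32) | left  # final swap of the halves

def encrypt(block, subkeys):
    return feistel(block, subkeys)

def decrypt(block, subkeys):
    # Same network, subkeys used in reverse order.
    return feistel(block, subkeys[::-1])

# Illustrative subkeys only; a real cipher derives them from the key.
SUBKEYS = [(0x9E3779B9 * (i + 1)) & 0xFFFFFFFF for i in range(16)]
message = 0x0123456789ABCDEF
assert decrypt(encrypt(message, SUBKEYS), SUBKEYS) == message
```

Only the xor structure of the rounds is used in the round trip, which is why no property of F is needed.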

4.3 Description of DES 

The overall DES encryption algorithm is shown in Fig. 3.7. It encrypts 64-bit plaintext blocks using a 56-bit 

key. The details of each round are shown in Fig. 3.8 and the computation of F is shown in Fig. 3.9.


4.4 Analysis of DES 

Two points were criticized: 

- S-boxes; as the only nonlinear part, they are vital to security. It was suggested that they contain trapdoors which would allow NSA to decrypt. The evidence so far shows that the S-boxes were built to resist certain advanced attacks, such as differential cryptanalysis, which was known to NSA 20 years before Biham and Shamir rediscovered it in 1991. As we shall see later, a differential cryptanalysis attack on (16-round) DES requires $2^{55.1}$ operations compared to $2^{55}$ needed by a brute force attack. If DES had fewer rounds, then differential cryptanalysis would require less effort than a brute force attack.

- key size; the original Lucifer had 128 bits; the proposed DES had 64, which was reduced to 56 to include 8 parity check bits.

- 1977 – Diffie and Hellman estimated at $20,000,000 the cost of a machine to break DES in one day

- 1993 – Wiener estimated at $100,000 the cost of a machine to break DES in 1.5 days

- 1998 – a $250,000 machine was built by the Electronic Frontier Foundation and broke DES in 56 hours

- 1999 – a worldwide network broke DES in 22h 15min

We mention further that linear cryptanalysis is more efficient than differential cryptanalysis: DES was broken using $2^{43}$ plaintext-ciphertext pairs. (Of course, in practice such an attempt is not likely to succeed due to the huge number of pairs required.)

4.5 Modes of operation 

- electronic codebook mode (ECB) (Fig. 3.11) 

- for a given key, there is a unique ciphertext for every 64-bit input 

- good for short messages, such as a DES key 

- not good for long messages due to its regularity 

- cipher block chaining mode (CBC) (Fig. 3.12) 

- the same block of plaintext will produce a different ciphertext 

- an initial vector IV is used for the first ciphertext block; IV must be secretly known by both parties; it can be sent using ECB

- if IV is revealed, then problems might appear; for instance, $C_1 = E_k(IV \oplus P_1)$ implies $P_1 = IV \oplus D_k(C_1)$, and so complementing a bit of IV complements the corresponding bit of the recovered $P_1$.

- cipher feedback mode (CFB) (Fig. 3.13) 

- this is a stream cipher 

- ciphertext fed back to the shift register


- plaintext divided into blocks of s bits 

- operates in real time 

- good for authentication 

- notice the use of encryption function only


- output feedback mode (OFB) (Fig. 3.14) 

- similar; the output of the encryption is fed back to register – bit errors in transmission do not propagate 

(used for satellite transmissions) 

- more vulnerable to message stream modification attack than CFB 

- counter mode (CTR) (Fig. 3.15) 

- most recent 

- a counter is used; its value must be different for each encrypted block; usually the counter is incremented by 1 modulo $2^b$, where b is the block size

- advantages 

- hardware and software efficiency – can be done in parallel 

- preprocessing possible 

- random access in ciphertext possible 

- does not require the decryption function implemented 
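The behaviour of ECB, CBC and CTR described above can be observed on a toy example. The sketch below is an illustration only: the "block cipher" is an assumption (a key-seeded random permutation of 8-bit blocks standing in for DES), and all function names are hypothetical.

```python
import random

BLOCK_BITS = 8  # toy block size; DES uses 64

def make_cipher(key):
    """Toy block cipher (assumption): a key-seeded permutation of all 8-bit blocks."""
    table = list(range(1 << BLOCK_BITS))
    random.Random(key).shuffle(table)
    inverse = [0] * len(table)
    for x, y in enumerate(table):
        inverse[y] = x
    return (lambda x: table[x]), (lambda y: inverse[y])

def ecb_encrypt(E, blocks):
    return [E(p) for p in blocks]

def cbc_encrypt(E, iv, blocks):
    out, prev = [], iv
    for p in blocks:
        prev = E(prev ^ p)      # C_i = E(C_{i-1} xor P_i), with C_0 = IV
        out.append(prev)
    return out

def cbc_decrypt(D, iv, blocks):
    out, prev = [], iv
    for c in blocks:
        out.append(D(c) ^ prev)  # P_i = D(C_i) xor C_{i-1}
        prev = c
    return out

def ctr_crypt(E, counter, blocks):
    """CTR encryption and decryption coincide and use only E."""
    return [b ^ E((counter + i) % (1 << BLOCK_BITS)) for i, b in enumerate(blocks)]

E, D = make_cipher(key=2011)
plaintext = [0x41, 0x41, 0x41, 0x42]
ecb = ecb_encrypt(E, plaintext)          # equal plaintext blocks give equal ciphertext blocks
cbc = cbc_encrypt(E, 0x5A, plaintext)
ctr = ctr_crypt(E, 0x10, plaintext)
```

Decrypting the CBC ciphertext with an IV that differs in one bit changes exactly that bit of the first recovered plaintext block, which is the problem with a revealed IV mentioned in the CBC item.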

4.6 Triple DES 

DES is no longer safe. We can build new ciphers or try to use DES in a safe way. 

- double DES uses two DES keys; see Fig. 6.1. We have

$$C = E_{K_2}(E_{K_1}(P)), \qquad P = D_{K_1}(D_{K_2}(C)).$$

It is very likely that double DES cannot be simulated by a single DES, that is, it produces a different mapping. So, we should have an increase to a key of 112 bits.

- meet-in-the-middle attack

- we have $E_{K_1}(P) = D_{K_2}(C)$

- so, given a pair (P,C), we encrypt P using all $2^{56}$ possible values for $K_1$ and store the results in a table

- then we decrypt C using all $2^{56}$ possible values for $K_2$ and match the results against the ones in the table

- when a match occurs, we test the pair of keys against a different plaintext-ciphertext pair

- each plaintext is encrypted by double DES into one of $2^{64}$ possible ciphertexts; since there are $2^{112}$ keys, on average a plaintext P is encrypted to a ciphertext C by $2^{48}$ keys

- so, for the first pair a match will produce a false alarm with probability $1 - 2^{-48}$

- a false alarm for both pairs will be produced with very small probability: $2^{48-64} = 2^{-16}$.

- so double DES is not much more secure than DES
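The table-and-match procedure above can be sketched on a toy cipher. Everything in the block is an assumption made for illustration: the cipher `E`/`D` is a hypothetical 16-bit block cipher with 8-bit keys, not DES, so that the search runs instantly. The point is the work factor: about 2·2^8 cipher operations instead of 2^16 for exhaustive search over both keys.

```python
MASK = 0xFFFF
MULT = 0x2545                      # odd, hence invertible modulo 2**16
MULT_INV = pow(MULT, -1, 1 << 16)  # modular inverse (Python 3.8+)

def E(k, x):
    """Toy 16-bit block cipher with an 8-bit key (assumption, not DES)."""
    return ((x ^ (k * 0x0101)) * MULT + k) & MASK

def D(k, y):
    return (((y - k) * MULT_INV) & MASK) ^ (k * 0x0101)

def double_encrypt(k1, k2, p):
    return E(k2, E(k1, p))

def meet_in_the_middle(pairs):
    """Return all key pairs (k1, k2) consistent with the given (P, C) pairs."""
    (p1, c1), rest = pairs[0], pairs[1:]
    table = {}
    for k1 in range(256):                      # encrypt P under every K1
        table.setdefault(E(k1, p1), []).append(k1)
    candidates = []
    for k2 in range(256):                      # decrypt C under every K2
        for k1 in table.get(D(k2, c1), []):    # match in the middle
            # discard false alarms using the remaining pairs
            if all(double_encrypt(k1, k2, p) == c for p, c in rest):
                candidates.append((k1, k2))
    return candidates

SECRET = (0x3C, 0xA7)
pairs = [(p, double_encrypt(SECRET[0], SECRET[1], p)) for p in (0x1234, 0xBEEF, 0x0F0F)]
candidates = meet_in_the_middle(pairs)
```

The true key pair is always among the returned candidates; extra pairs only serve to eliminate false alarms.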

- triple DES (3DES) performs three stages of encryption using two keys; see Fig. 6.1. We have

$$C = E_{K_1}(D_{K_2}(E_{K_1}(P))), \qquad P = D_{K_1}(E_{K_2}(D_{K_1}(C))).$$

The only use of the decryption in the middle is to allow users of 3DES to decrypt data encrypted by single DES: with all keys equal to K we get

$$C = E_K(D_K(E_K(P))) = E_K(P).$$

- no known effective attacks

- one can also use 3DES with three keys


5 LINEAR AND DIFFERENTIAL CRYPTANALYSIS 

These are the most powerful attacks against symmetric block ciphers. In this section we describe the two 

attacks. They are very complex and we shall describe them on a simple model called substitution-permutation 

network. 

5.1 Iterated ciphers 

A commonly used design in most modern-day block ciphers is that of an iterated cipher.

An iterated cipher consists of a round function and a key schedule. Given a key K (usually a random binary key of specified length), we construct the key schedule $(K^1, K^2, \dots, K^{Nr})$ using a fixed public algorithm; the components $K^r$ are called round keys. The round function, say g, takes two inputs: a round key $K^r$ and a current state of the plaintext being encrypted, and produces the next state. The initial state is the plaintext and the last state will be the ciphertext. Therefore, the encryption algorithm looks as below:

$w^0 \leftarrow x$

$w^1 \leftarrow g(w^0, K^1)$

$w^2 \leftarrow g(w^1, K^2)$

. . .

$w^{Nr-1} \leftarrow g(w^{Nr-2}, K^{Nr-1})$

$w^{Nr} \leftarrow g(w^{Nr-1}, K^{Nr})$

$y \leftarrow w^{Nr}$

In order for the decryption to be possible, g has to be injective when its second argument is fixed; that is, there exists $g^{-1}$ such that

$$g^{-1}(g(w,k),k) = w,$$

for all w and k. In this case the decryption is done by a similar algorithm:

$w^{Nr} \leftarrow y$

$w^{Nr-1} \leftarrow g^{-1}(w^{Nr}, K^{Nr})$

. . .

$w^1 \leftarrow g^{-1}(w^2, K^2)$

$w^0 \leftarrow g^{-1}(w^1, K^1)$

$x \leftarrow w^0$

5.2 Substitution-permutation network

A substitution-permutation network (SPN) is a special type of iterated cipher with a few changes. Given two positive integers l and m (lm will be the block length of the cipher), an SPN is built from two components: a substitution (which is technically a permutation)

$$\pi_S : \{0,1\}^l \to \{0,1\}^l$$

and a permutation

$$\pi_P : \{1,2,\dots,lm\} \to \{1,2,\dots,lm\}.$$

$\pi_S$ is called an S-box ('S' comes from "substitution") and will be used to replace l bits with a different set of l bits. $\pi_P$ will be used to permute lm bits.

Given an lm-bit binary string $x = (x_1, x_2, \dots, x_{lm})$ we regard x as a concatenation of m l-bit substrings $x_{(1)}, x_{(2)}, \dots, x_{(m)}$. That is

$$x = x_{(1)} \| x_{(2)} \| \cdots \| x_{(m)}$$

where, for each 1 ≤ i ≤ m, we have

$$x_{(i)} = (x_{(i-1)l+1}, \dots, x_{il}).$$


Substitution-permutation network

$\mathcal{P} = \mathcal{C} = \{0,1\}^{lm}$, $\mathcal{K} \subseteq (\{0,1\}^{lm})^{Nr+1}$

encryption: Nr rounds, each (except the last) including:

- xor with a round key (round key mixing)

- a substitution using $\pi_S$

- a permutation using $\pi_P$

SPN$(x, \pi_S, \pi_P, (K^1, K^2, \dots, K^{Nr+1}))$

    $w^0 \leftarrow x$
    for r from 1 to Nr−1 do
        $u^r \leftarrow w^{r-1} \oplus K^r$
        for i from 1 to m do
            $v^r_{(i)} \leftarrow \pi_S(u^r_{(i)})$
        $w^r \leftarrow (v^r_{\pi_P(1)}, \dots, v^r_{\pi_P(lm)})$
    $u^{Nr} \leftarrow w^{Nr-1} \oplus K^{Nr}$
    for i from 1 to m do
        $v^{Nr}_{(i)} \leftarrow \pi_S(u^{Nr}_{(i)})$
    $y \leftarrow v^{Nr} \oplus K^{Nr+1}$
    return y

decryption: similar to encryption, except that

- the S-boxes are replaced by their inverses and

- the key schedule is reversed.

Example 5.1. Assume l = m = Nr = 4 and $\pi_S$ and $\pi_P$ defined as below (in the definition of $\pi_S$ each 4-tuple of bits is represented in hexadecimal):

z          0  1  2  3  4  5  6  7  8  9  A  B  C  D  E  F
π_S(z)     E  4  D  1  2  F  B  8  3  A  6  C  5  9  0  7

z          1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16
π_P(z)     1  5  9 13  2  6 10 14  3  7 11 15  4  8 12 16

This SPN is also shown in Fig. 6, where the S-boxes carry different labels for easier reference. They all represent the same S-box, namely $\pi_S$.

The description of our SPN is completed by specifying the key scheduling algorithm. Here is a simple possibility. We start with a 32-bit key $K = (k_1, \dots, k_{32}) \in \{0,1\}^{32}$. For 1 ≤ r ≤ 5, define $K^r$ to contain the 16 consecutive bits starting with $k_{4r-3}$. For instance, if

K = 0011 1010 1001 0100 1101 0110 0011 1111,

then the round keys are

$K^1$ = 0011 1010 1001 0100
$K^2$ = 1010 1001 0100 1101
$K^3$ = 1001 0100 1101 0110
$K^4$ = 0100 1101 0110 0011
$K^5$ = 1101 0110 0011 1111

If the plaintext is

x = 0010 0110 1011 0111

then the encryption proceeds as follows:

$w^0$ = 0010 0110 1011 0111
$K^1$ = 0011 1010 1001 0100
$u^1$ = 0001 1100 0010 0011
$v^1$ = 0100 0101 1101 0001
$w^1$ = 0010 1110 0000 0111
$K^2$ = 1010 1001 0100 1101
$u^2$ = 1000 0111 0100 1010
$v^2$ = 0011 1000 0010 0110
$w^2$ = 0100 0001 1011 1000
$K^3$ = 1001 0100 1101 0110
$u^3$ = 1101 0101 0110 1110
$v^3$ = 1001 1111 1011 0000
$w^3$ = 1110 0100 0110 1110
$K^4$ = 0100 1101 0110 0011
$u^4$ = 1010 1001 0000 1101
$v^4$ = 0110 1010 1110 1001
$K^5$ = 1101 0110 0011 1111
y = 1011 1100 1101 0110

Figure 6: A substitution-permutation network
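The computation of Example 5.1 can be reproduced with a few lines of code. The sketch below assumes one particular representation, which is a choice made here and not part of the notes: a 16-bit state is stored as an integer whose most significant bit is bit 1. The function names are illustrative.

```python
S_BOX = [0xE, 0x4, 0xD, 0x1, 0x2, 0xF, 0xB, 0x8,
         0x3, 0xA, 0x6, 0xC, 0x5, 0x9, 0x0, 0x7]
PI_P = [1, 5, 9, 13, 2, 6, 10, 14, 3, 7, 11, 15, 4, 8, 12, 16]

def round_keys(key32):
    """K^r = the 16 consecutive key bits starting with k_{4r-3}, for r = 1..5."""
    return [(key32 >> (20 - 4 * r)) & 0xFFFF for r in range(1, 6)]

def substitute(u):
    """Apply the S-box to each of the four 4-bit substrings."""
    return sum(S_BOX[(u >> shift) & 0xF] << shift for shift in (12, 8, 4, 0))

def permute(v):
    """w_j = v_{pi_P(j)}, with bits numbered 1..16 from the left."""
    w = 0
    for j, src in enumerate(PI_P, start=1):
        bit = (v >> (16 - src)) & 1
        w |= bit << (16 - j)
    return w

def spn_encrypt(x, key32):
    k = round_keys(key32)
    w = x
    for r in range(3):                       # rounds 1..Nr-1
        w = permute(substitute(w ^ k[r]))
    return substitute(w ^ k[3]) ^ k[4]       # last round: no permutation

KEY = 0x3A94D63F   # 0011 1010 1001 0100 1101 0110 0011 1111
y = spn_encrypt(0x26B7, KEY)   # plaintext 0010 0110 1011 0111
```

With the key and plaintext of the example, `y` equals 0xBCD6, that is, 1011 1100 1101 0110.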


Comments on SPNs: 

• design is simple and efficient in both software and hardware 

• in software, an S-box is implemented as a look-up table; memory required is $l \cdot 2^l$ bits; in Example 5.1 each S-box requires $2^6$ bits; AES uses an S-box which maps 8 bits to 8 bits (key size at least 128 bits, block length 128, at least 10 rounds)

• it is possible to use more than one S-box; DES uses eight different S-boxes in each round 

• in each round an invertible linear transformation can be included as a replacement or in addition to the 

permutation operation; this is done in AES 

5.3 Linear cryptanalysis 

We start by describing the basic idea which can be applied, in principle, to any iterated cipher. Suppose it 

is possible to find a probabilistic linear relationship between a subset of plaintext bits and a subset of state 

bits immediately preceding the substitutions performed in the last round. In other words, there exists a subset 

of bits whose xor behaves in a non-random fashion; that is, it takes on the value 0 (or 1) with a probability 

bounded away from 1/2. Now assume the attacker has a large number of plaintext-ciphertext pairs, all of which 

are encrypted with the same unknown key K; i.e., we have a known plaintext attack. For each of the plaintext-ciphertext

pairs, we will begin to decrypt the ciphertext, using all possible candidate keys for the last round

of the cipher. For each candidate key, we compute the values of the relevant state bits involved in the linear 

relationship and determine if the above mentioned linear relationship holds. Whenever it does, we increment a 

counter corresponding to the particular candidate key. At the end of the process we hope that the candidate 

key that has a frequency count that is furthest from 1/2 times the number of pairs contains the correct values 

for the key bits involved. 

5.3.1 The piling-up lemma 

Consider $X_i$, i = 1,2,3,..., independent random variables taking values from {0,1} and suppose that

$$\mathrm{Prob}[X_i = 0] = p_i.$$

The independence of $X_i$ and $X_j$ implies

$$\mathrm{Prob}[X_i \oplus X_j = 0] = p_i p_j + (1-p_i)(1-p_j),$$

$$\mathrm{Prob}[X_i \oplus X_j = 1] = p_i (1-p_j) + (1-p_i) p_j.$$

The bias of $X_i$ is

$$\epsilon_i = p_i - \frac{1}{2}.$$

Notice that $-1/2 \le \epsilon_i \le 1/2$, $\mathrm{Prob}[X_i = 0] = 1/2 + \epsilon_i$, and $\mathrm{Prob}[X_i = 1] = 1/2 - \epsilon_i$.

For $i_1 < i_2 < \cdots < i_k$, let $\epsilon_{i_1,i_2,\dots,i_k}$ denote the bias of the random variable $X_{i_1} \oplus X_{i_2} \oplus \cdots \oplus X_{i_k}$.

Lemma 5.2 (Piling-up lemma). If $\epsilon_{i_1,i_2,\dots,i_k}$ is the bias of the random variable $X_{i_1} \oplus X_{i_2} \oplus \cdots \oplus X_{i_k}$, then

$$\epsilon_{i_1,i_2,\dots,i_k} = 2^{k-1} \prod_{j=1}^{k} \epsilon_{i_j}.$$

Corollary 5.3. If $\epsilon_{i_1,i_2,\dots,i_k}$ is the bias of the random variable $X_{i_1} \oplus X_{i_2} \oplus \cdots \oplus X_{i_k}$ and $\epsilon_{i_j} = 0$ for some j, then $\epsilon_{i_1,i_2,\dots,i_k} = 0$.

It is important to notice that the piling-up lemma holds, in general, only when the random variables are independent. As an example, consider independent $X_1, X_2, X_3$ with $\epsilon_i = 1/4$ for all i. By the piling-up lemma we get $\epsilon_{1,2} = \epsilon_{1,3} = \epsilon_{2,3} = 1/8$. Consider the two variables $X_1 \oplus X_2$ and $X_2 \oplus X_3$. We have $(X_1 \oplus X_2) \oplus (X_2 \oplus X_3) = X_1 \oplus X_3$. If $X_1 \oplus X_2$ and $X_2 \oplus X_3$ were independent we would have $\epsilon_{1,3} = 2(1/8)^2 = 1/32$. But $\epsilon_{1,3} = 1/8$.
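The lemma can be checked numerically for small k by summing the probabilities of all outcomes with even parity. The sketch below uses exact rational arithmetic; the function names are illustrative and not part of the notes.

```python
from fractions import Fraction
from itertools import product
from math import prod

def xor_bias(p_zero):
    """Exact bias of X_1 xor ... xor X_k for independent bits with Prob[X_i = 0] = p_zero[i]."""
    prob_zero = Fraction(0)
    for bits in product((0, 1), repeat=len(p_zero)):
        if sum(bits) % 2 == 0:  # the xor of these bits is 0
            prob_zero += prod(p if b == 0 else 1 - p for p, b in zip(p_zero, bits))
    return prob_zero - Fraction(1, 2)

def piling_up(biases):
    """Right-hand side of the piling-up lemma: 2^(k-1) times the product of the biases."""
    return 2 ** (len(biases) - 1) * prod(biases)

biases = [Fraction(1, 4), Fraction(1, 4)]
direct = xor_bias([b + Fraction(1, 2) for b in biases])   # Fraction(1, 8)
```

For two variables of bias 1/4 both computations give 1/8, in agreement with the example above.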


5.4 Linear approximation of S-boxes 

Consider a general S-box $\pi_S : \{0,1\}^m \to \{0,1\}^n$; notice that we do not require that m = n. An input is $x = (x_1, \dots, x_m)$, where each $x_i$ defines a random variable $X_i$ taking on values 0 and 1 and having bias $\epsilon_i = 0$; these variables are independent.

The output is $y = (y_1, \dots, y_n)$ and each $y_i$ defines a variable $Y_i$. Clearly, these variables are not independent from each other and from the $X_i$'s.

Next, we compute the bias of variables of the form

$$X_{i_1} \oplus \cdots \oplus X_{i_k} \oplus Y_{j_1} \oplus \cdots \oplus Y_{j_l}.$$

A linear cryptanalytic attack can potentially be mounted when a random variable of this form has a bias that is bounded away from 0.

Example 5.4. For the S-box in Example 5.1, we compute all possible values taken by the eight random variables $X_1, \dots, X_4, Y_1, \dots, Y_4$ in the table below.

X1 X2 X3 X4   Y1 Y2 Y3 Y4   X1⊕X4⊕Y2   X3⊕X4⊕Y1⊕Y4
 0  0  0  0    1  1  1  0       1           1
 0  0  0  1    0  1  0  0       0           1
 0  0  1  0    1  1  0  1       1           1
 0  0  1  1    0  0  0  1       1           1
 0  1  0  0    0  0  1  0       0           0
 0  1  0  1    1  1  1  1       0           1
 0  1  1  0    1  0  1  1       0           1
 0  1  1  1    1  0  0  0       1           1
 1  0  0  0    0  0  1  1       1           1
 1  0  0  1    1  0  1  0       0           0
 1  0  1  0    0  1  1  0       0           1
 1  0  1  1    1  1  0  0       1           1
 1  1  0  0    0  1  0  1       0           1
 1  1  0  1    1  0  0  1       0           1
 1  1  1  0    0  0  0  0       1           1
 1  1  1  1    0  1  1  1       1           1

If we consider now the random variable $X_1 \oplus X_4 \oplus Y_2$, the bias of this variable is 0, as seen in the table above (it takes the value 0 in exactly 8 of the 16 rows). So, it is not suitable for a linear cryptanalytic attack. On the other hand, the random variable $X_3 \oplus X_4 \oplus Y_1 \oplus Y_4$ has bias −3/8 (it takes the value 0 in only 2 of the 16 rows); see the above table.

□

We next compute the biases of all $2^8 = 256$ random variables of this form. We represent each such random variable in the form

$$\Big(\bigoplus_{i=1}^{4} a_i X_i\Big) \oplus \Big(\bigoplus_{i=1}^{4} b_i Y_i\Big)$$

where $a_i, b_i \in \{0,1\}$. We then treat each 4-tuple $a = (a_1,a_2,a_3,a_4)$ and $b = (b_1,b_2,b_3,b_4)$ as a hexadecimal digit; the former is called the input sum and the latter is called the output sum. We denote by $N_L(a,b)$ the number of binary 8-tuples $(x_1,x_2,x_3,x_4,y_1,y_2,y_3,y_4)$ such that

$$\pi_S(x_1,x_2,x_3,x_4) = (y_1,y_2,y_3,y_4)$$

and

$$\Big(\bigoplus_{i=1}^{4} a_i x_i\Big) \oplus \Big(\bigoplus_{i=1}^{4} b_i y_i\Big) = 0.$$

Notice that the bias of a random variable having input sum a and output sum b is

$$\epsilon(a,b) = \frac{N_L(a,b) - 8}{16}.$$


The table containing all values of N L is called the linear approximation table. For our example, it is shown in 

Fig. 7. 

$N_L(a,b)$; rows are indexed by a (input sum), columns by b (output sum)

a\b 0 1 2 3 4 5 6 7 8 9 A B C D E F

0 16 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 

1 8 8 6 6 8 8 6 14 10 10 8 8 10 10 8 8 

2 8 8 6 6 8 8 6 6 8 8 10 10 8 8 2 10 

3 8 8 8 8 8 8 8 8 10 2 6 6 10 10 6 6 

4 8 10 8 6 6 4 6 8 8 6 8 10 10 4 10 8 

5 8 6 6 8 6 8 12 10 6 8 4 10 8 6 6 8 

6 8 10 6 12 10 8 8 10 8 6 10 12 6 8 8 6 

7 8 6 8 10 10 4 10 8 6 8 10 8 12 10 8 10 

8 8 8 8 8 8 8 8 8 6 10 10 6 10 6 6 2 

9 8 8 6 6 8 8 6 6 4 8 6 10 8 12 10 6 

A 8 12 6 10 4 8 10 6 10 10 8 8 10 10 8 8 

B 8 12 8 4 12 8 12 8 8 8 8 8 8 8 8 8 

C 8 6 12 6 6 8 10 8 10 8 10 12 8 10 8 6 

D 8 10 10 8 6 12 8 10 4 6 10 8 10 8 8 10 

E 8 10 10 8 6 4 8 10 6 8 8 6 4 10 6 8 

F 8 6 4 6 6 8 10 8 8 6 12 6 6 8 10 8 

Figure 7: A linear approximation table 
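The table in Fig. 7 can be recomputed directly from the definition of $N_L(a,b)$. The following sketch assumes that a hexadecimal digit encodes a 4-tuple with its first component as the most significant bit, as in Example 5.1; the function names are illustrative.

```python
S_BOX = [0xE, 0x4, 0xD, 0x1, 0x2, 0xF, 0xB, 0x8,
         0x3, 0xA, 0x6, 0xC, 0x5, 0x9, 0x0, 0x7]

def parity(value):
    """Xor of all bits of value."""
    return bin(value).count("1") % 2

def linear_approximation_table(sbox):
    """table[a][b] = number of inputs x with (a . x) xor (b . sbox[x]) = 0."""
    n = len(sbox)
    return [[sum(1 for x in range(n) if parity(a & x) == parity(b & sbox[x]))
             for b in range(n)]
            for a in range(n)]

def bias(table, a, b):
    return (table[a][b] - 8) / 16

LAT = linear_approximation_table(S_BOX)
# X3 xor X4 xor Y1 xor Y4 has input sum 0011 = 3 and output sum 1001 = 9
example_bias = bias(LAT, 0x3, 0x9)   # -0.375, i.e. -3/8
```

The two variables of Example 5.4 correspond to the entries (a,b) = (9,4) and (3,9) of the table.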

5.5 A linear attack on SPN 

Linear cryptanalysis requires finding a set of linear approximations of S-boxes that can be used to derive a 

linear approximation of the entire SPN (excluding the last round). 

We will illustrate the procedure using the SPN in Example 5.1. The attack is also shown in Fig. 8; thick 

lines correspond to random variables which are involved in the linear approximations and the labelled S-boxes 

are the ones involved in the approximations – they are called active S-boxes. 

The approximation incorporates four active S-boxes: 

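• in $S^1_2$: the random variable $T_1 = U^1_5 \oplus U^1_7 \oplus U^1_8 \oplus V^1_6$ has bias 1/4;

• in $S^2_2$: the random variable $T_2 = U^2_6 \oplus V^2_6 \oplus V^2_8$ has bias −1/4;

• in $S^3_2$: the random variable $T_3 = U^3_6 \oplus V^3_6 \oplus V^3_8$ has bias −1/4;

• in $S^3_4$: the random variable $T_4 = U^3_{14} \oplus V^3_{14} \oplus V^3_{16}$ has bias −1/4;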

The four random variables T i have biases which are high in absolute value. Also, their xor will lead to 

cancellations of intermediate random variables. 

If we make the assumption that these four random variables are independent, then we can compute the bias 

of their xor 

T 1 ⊕T 2 ⊕T 3 ⊕T 4 

using the piling lemma. (Actually, these variables are not independent, which means that piling lemma will not 

give the correct result. However, it gives in practice a reasonably good approximation which works well for our 

attack.) Therefore, by piling lemma, we hypothesize that the random variable T 1 ⊕T 2 ⊕T 3 ⊕T 4 has bias −1/32.
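Written out, the value −1/32 comes from the piling-up lemma with k = 4 and the four biases 1/4, −1/4, −1/4, −1/4 listed above (under the independence assumption just discussed):

$$\epsilon_{1,2,3,4} = 2^{4-1}\,\epsilon_1 \epsilon_2 \epsilon_3 \epsilon_4 = 2^3 \cdot \frac{1}{4} \cdot \Big(-\frac{1}{4}\Big)^3 = -\frac{8}{256} = -\frac{1}{32}.$$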


Figure 8: A linear approximation of a substitution-permutation network

Next, we can write (see Fig. 8)

$$T_1 = X_5 \oplus K^1_5 \oplus X_7 \oplus K^1_7 \oplus X_8 \oplus K^1_8 \oplus V^1_6$$

$$T_2 = V^1_6 \oplus K^2_6 \oplus V^2_6 \oplus V^2_8$$

$$T_3 = V^2_6 \oplus K^3_6 \oplus U^4_6 \oplus K^4_6 \oplus U^4_{14} \oplus K^4_{14}$$

$$T_4 = V^2_8 \oplus K^3_{14} \oplus U^4_8 \oplus K^4_8 \oplus U^4_{16} \oplus K^4_{16}$$

The xor $T_1 \oplus T_2 \oplus T_3 \oplus T_4$ becomes

$$X_5 \oplus X_7 \oplus X_8 \oplus U^4_6 \oplus U^4_8 \oplus U^4_{14} \oplus U^4_{16} \oplus K^1_5 \oplus K^1_7 \oplus K^1_8 \oplus K^2_6 \oplus K^3_6 \oplus K^3_{14} \oplus K^4_6 \oplus K^4_8 \oplus K^4_{14} \oplus K^4_{16}$$

and so the last random variable also has bias (approximately) −1/32. It involves only bits of the plaintext, of $u^4$, and of the key. Suppose that the key bits are fixed. Then the random variable

$$K^1_5 \oplus K^1_7 \oplus K^1_8 \oplus K^2_6 \oplus K^3_6 \oplus K^3_{14} \oplus K^4_6 \oplus K^4_8 \oplus K^4_{14} \oplus K^4_{16}$$

has a fixed value, 0 or 1. Therefore, the random variable

$$X_5 \oplus X_7 \oplus X_8 \oplus U^4_6 \oplus U^4_8 \oplus U^4_{14} \oplus U^4_{16}$$

has bias ±1/32 (approximately), depending on the values of the key bits. This bias will allow us to carry out the linear attack.

Assume we have $N_l$ plaintext-ciphertext pairs, all using the same unknown key K. The attack will allow us to obtain the key bits

$$K^5_5, K^5_6, K^5_7, K^5_8, K^5_{13}, K^5_{14}, K^5_{15}, K^5_{16},$$

that is, the eight key bits that are xored with the output of the S-boxes $S^4_2$ and $S^4_4$. (They correspond to the bits of $u^4$ involved in our linear relation.)

There are $2^8 = 256$ possibilities for these eight bits. Any binary 8-tuple containing values for these eight key bits will be called a candidate subkey.

For each pair (x,y) of plaintext-ciphertext and each candidate subkey, we compute a partial decryption of y to obtain the resulting values for $u^4_{(2)}$ and $u^4_{(4)}$. Then we compute the value

$$x_5 \oplus x_7 \oplus x_8 \oplus u^4_6 \oplus u^4_8 \oplus u^4_{14} \oplus u^4_{16}.$$

We maintain an array of counters indexed by the 256 candidate subkeys and increment the counter corresponding to a particular subkey whenever the previous result is 0.

At the end, we expect most counters to be close to $N_l/2$ but the counter for the correct candidate key will be close to $N_l/2 \pm N_l/32$. This will hopefully allow us to identify the correct subkey.
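The counting procedure just described can be sketched as follows. This is an illustration under stated assumptions, not a tuned implementation: 16-bit values are integers with bit 1 as the most significant bit, a candidate subkey is split into two nibbles `l1` (for key bits 5 to 8) and `l2` (for key bits 13 to 16), and the counter index `16*l1 + l2` is a convention chosen here.

```python
S_BOX = [0xE, 0x4, 0xD, 0x1, 0x2, 0xF, 0xB, 0x8,
         0x3, 0xA, 0x6, 0xC, 0x5, 0x9, 0x0, 0x7]
S_INV = [S_BOX.index(y) for y in range(16)]

def bit(value, i, width=16):
    """Bit i of value, numbered 1..width from the left as in the notes."""
    return (value >> (width - i)) & 1

def linear_attack_counts(pairs):
    """counts[16*l1 + l2] = number of pairs for which the linear relation evaluates to 0."""
    counts = [0] * 256
    for x, y in pairs:
        y2, y4 = (y >> 8) & 0xF, y & 0xF        # ciphertext substrings y_(2), y_(4)
        for l1 in range(16):
            for l2 in range(16):
                u2 = S_INV[y2 ^ l1]              # partial decryption: u^4_(2)
                u4 = S_INV[y4 ^ l2]              # partial decryption: u^4_(4)
                z = (bit(x, 5) ^ bit(x, 7) ^ bit(x, 8)
                     ^ bit(u2, 2, 4) ^ bit(u2, 4, 4)    # u^4_6, u^4_8
                     ^ bit(u4, 2, 4) ^ bit(u4, 4, 4))   # u^4_14, u^4_16
                if z == 0:
                    counts[16 * l1 + l2] += 1
    return counts

def best_subkey(counts, n_pairs):
    """Candidate whose counter is furthest from n_pairs / 2."""
    idx = max(range(256), key=lambda i: abs(counts[i] - n_pairs / 2))
    return idx >> 4, idx & 0xF

# A single known pair, taken from Example 5.1, just to exercise the counters.
counts = linear_attack_counts([(0x26B7, 0xBCD6)])
```

With many pairs encrypted under one key, `best_subkey` is expected (not guaranteed) to return the correct eight bits of $K^5$; a single pair, as used here, is of course far too few.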

For our example, some partial results for the counters corresponding to the candidate subkeys are shown in the table below; there $N_l = 10000$ and |bias| = |count − 5000|/10000. Notice that the entry for the subkey (2,4) in hexadecimal is 0.0336, very close to the expected 1/32 = 0.03125.

candidate subkey $(K^5_5, \dots, K^5_8, K^5_{13}, \dots, K^5_{16})$    |bias|

1 C    0.0031
1 D    0.0078
1 E    0.0071
1 F    0.0170
2 0    0.0025
2 1    0.0220
2 2    0.0211
2 3    0.0064
2 4    0.0336
2 5    0.0106
2 6    0.0096
2 7    0.0074
2 8    0.0224
2 9    0.0054
2 A    0.0044
2 B    0.0186
2 C    0.0094

5.6 Complexity of attack

Let $\epsilon$ denote the bias of the probability that the linear expression for the complete cipher holds. The number $N_l$ of known plaintext-ciphertext pairs required is approximated to be

$$N_l \approx 1/\epsilon^2.$$

In practice $N_l$ is a small multiple of $1/\epsilon^2$. In our example, $\epsilon = 1/32$ gives $1/\epsilon^2 = 1024$, so $N_l = 10000$ was about ten times $1/\epsilon^2$.


5.7 Differential cryptanalysis 

Differential cryptanalysis is similar to linear cryptanalysis in many respects. The main difference is that differential 

cryptanalysis involves comparing the xor of two inputs to the xor of the corresponding two outputs. 

We will be looking at (binary) inputs x and x ∗ and denote their xor by x ′ = x⊕x ∗ . 

Differential cryptanalysis is a chosen-plaintext attack. We assume that the attacker has a large number of

4-tuples (x,x ∗ ,y,y ∗ ) where the xor value x ′ = x⊕x ∗ is fixed. The plaintexts x and x ∗ are encrypted using the 

same unknown key K, yielding the ciphertexts y and y ∗ , resp. For each such 4-tuple, we will begin to decrypt 

the ciphertexts y and y ∗ using all possible candidate keys for the last round of the cipher. For each candidate 

key, we compute the values of certain state bits and determine if their xor has the value which is most likely for 

the given input xor. Whenever it does, we increment a counter corresponding to the particular candidate key. 

At the end, we hope that the candidate key with the highest frequency count contains the right values for the 

key bits involved. 

Let $\pi_S : \{0,1\}^m \to \{0,1\}^n$ be an S-box. For a pair of m-bit strings $(x, x^*)$, we say that $x \oplus x^*$ is the input xor of the S-box and $\pi_S(x) \oplus \pi_S(x^*)$ is the output xor of the S-box. For an m-bit string $x'$, we denote by $\Delta(x')$ the set of all pairs $(x, x^*)$ with input xor equal to $x'$. It is easy to see that $\Delta(x')$ contains $2^m$ pairs and

$$\Delta(x') = \{(x, x \oplus x') \mid x \in \{0,1\}^m\}.$$

For each pair in $\Delta(x')$ (i.e., the same input xor) we compute the output xor and then tabulate the results. There are $2^m$ output xors which are distributed among $2^n$ possible values. A non-uniform output distribution will be the basis for an attack.

Example 5.5. For the S-box in Example 5.1, consider the input xor x ′ = 1011. The table below contains 

the values of ∆(1011) in the first two columns, and then the outputs of the S-box and, in the last column, the 

output xor. 

x x ∗ = x⊕1011 y = π S (x) y ∗ = π S (x ∗ ) y ′ = y ⊕y ∗ 

0000 1011 1110 1100 0010 

0001 1010 0100 0110 0010 

0010 1001 1101 1010 0111 

0011 1000 0001 0011 0010 

0100 1111 0010 0111 0101 

0101 1110 1111 0000 1111 

0110 1101 1011 1001 0010 

0111 1100 1000 0101 1101 

1000 0011 0011 0001 0010 

1001 0010 1010 1101 0111 

1010 0001 0110 0100 0010 

1011 0000 1100 1110 0010 

1100 0111 0101 1000 1101 

1101 0110 1001 1011 0010 

1110 0101 0000 1111 1111 

1111 0100 0111 0010 0101 

The corresponding output xor distribution is (given by the last column) 

0000 0001 0010 0011 0100 0101 0110 0111 

0 0 8 0 0 2 0 2 

1000 1001 1010 1011 1100 1101 1110 1111 

0 0 0 0 0 2 0 2 

□


We can do the same as above for all input xors. Denote, for an input xor x ′ and an output xor y ′ the number 

of the input pairs with input xor x ′ and output xor y ′ by N D (x ′ ,y ′ ), that is, 

N D (x ′ ,y ′ ) = card({(x,x ∗ ) ∈ ∆(x ′ ) | π S (x)⊕π S (x ∗ ) = y ′ }). 

All values of N D (x ′ ,y ′ ) for our example are shown in Fig. 9. 

$N_D(x', y')$; rows are indexed by $x'$ (input xor), columns by $y'$ (output xor)

x'\y' 0 1 2 3 4 5 6 7 8 9 A B C D E F

0 16 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 

1 0 0 0 2 0 0 0 2 0 2 4 0 4 2 0 0 

2 0 0 0 2 0 6 2 2 0 2 0 0 0 0 2 0 

3 0 0 2 0 2 0 0 0 0 4 2 0 2 0 0 4 

4 0 0 0 2 0 0 6 0 0 2 0 4 2 0 0 0 

5 0 4 0 0 0 2 2 0 0 0 4 0 2 0 0 2 

6 0 0 0 4 0 4 0 0 0 0 0 0 2 2 2 2 

7 0 0 2 2 2 0 2 0 0 2 2 0 0 0 0 4 

8 0 0 0 0 0 0 2 2 0 0 0 4 0 4 2 2 

9 0 2 0 0 2 0 0 4 2 0 2 2 2 0 0 0 

A 0 2 2 0 0 0 0 0 6 0 0 2 0 0 4 0 

B 0 0 8 0 0 2 0 2 0 0 0 0 0 2 0 2 

C 0 2 0 0 2 2 2 0 0 0 0 2 0 6 0 0 

D 0 4 0 0 0 0 0 4 2 0 2 0 2 0 2 0 

E 0 0 2 4 2 0 0 0 6 0 0 0 0 0 2 0 

F 0 2 0 0 6 0 0 0 0 4 0 2 0 0 2 0 

Figure 9: A difference distribution table 
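The table in Fig. 9 follows directly from the definition of $N_D(x', y')$. A minimal sketch, with an illustrative function name:

```python
S_BOX = [0xE, 0x4, 0xD, 0x1, 0x2, 0xF, 0xB, 0x8,
         0x3, 0xA, 0x6, 0xC, 0x5, 0x9, 0x0, 0x7]

def difference_distribution_table(sbox):
    """table[dx][dy] = number of pairs (x, x xor dx) whose output xor is dy."""
    n = len(sbox)
    table = [[0] * n for _ in range(n)]
    for dx in range(n):
        for x in range(n):
            table[dx][sbox[x] ^ sbox[x ^ dx]] += 1
    return table

DDT = difference_distribution_table(S_BOX)
row_b = DDT[0xB]   # the output xor distribution of Example 5.5
```

Row B of the result is the distribution computed by hand in Example 5.5: output xor 0010 occurs 8 times and 0101, 0111, 1101, 1111 occur twice each.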

Notice next that the input xor to an S-box in round r is

$$u^r_{(i)} \oplus (u^r_{(i)})^* = (w^{r-1}_{(i)} \oplus K^r_{(i)}) \oplus ((w^{r-1}_{(i)})^* \oplus K^r_{(i)}) = w^{r-1}_{(i)} \oplus (w^{r-1}_{(i)})^*$$

so it does not depend on the key used in the same round; it is equal to the permuted output xor of the previous round.

For an input xor $x'$ and an output xor $y'$, the pair $(x', y')$ is called a differential. Each entry in the difference distribution table gives rise to a xor propagation ratio for the corresponding differential:

$$R_p(x', y') = \frac{N_D(x', y')}{2^m}.$$

$R_p(x', y')$ can also be interpreted as a conditional probability:

$$R_p(x', y') = \mathrm{Prob}[\text{output xor} = y' \mid \text{input xor} = x'].$$

The idea is to find propagation ratios for differentials in consecutive rounds of an SPN such that the input 

xor of a differential in any round is the same as the permuted output xor of the differentials in the previous 

round. Then these differentials can be combined to make a differential trail. We make the assumption that 

the propagation ratios in the differential trail are independent, which is not mathematically true in general. 

However, it is a reasonably good approximation in practice to multiply the propagation ratios (as if they were

independent) to obtain the propagation ratio of the entire trail. 

For our working SPN example, we can choose the following differentials, see Fig. 10 (the thick lines show 

the differential trail): 

• in $S^1_2$: $R_p(1011, 0010) = 1/2$;

• in $S^2_3$: $R_p(0100, 0110) = 3/8$;

• in $S^3_2$: $R_p(0010, 0101) = 3/8$;

• in $S^3_3$: $R_p(0010, 0101) = 3/8$.

Figure 10: A differential trail of a substitution-permutation network

Now, the propagation ratio for this trail is:

$$R_p(0000\ 1011\ 0000\ 0000,\ 0000\ 0101\ 0101\ 0000) = \frac{1}{2} \cdot \Big(\frac{3}{8}\Big)^3 = \frac{27}{1024}.$$

This means that

$x'$ = 0000 1011 0000 0000 implies that $(v^3)'$ = 0000 0101 0101 0000

with probability 27/1024. Therefore,

$x'$ = 0000 1011 0000 0000 implies that $(u^4)'$ = 0000 0110 0000 0110

with the same probability 27/1024.
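The trail can be checked mechanically: each propagation ratio is read off the S-box, and each output xor, once permuted by $\pi_P$, must give the input xor of the next round. The sketch below assumes 16-bit differences stored as integers with bit 1 as the most significant bit; the function names are illustrative.

```python
from fractions import Fraction

S_BOX = [0xE, 0x4, 0xD, 0x1, 0x2, 0xF, 0xB, 0x8,
         0x3, 0xA, 0x6, 0xC, 0x5, 0x9, 0x0, 0x7]
PI_P = [1, 5, 9, 13, 2, 6, 10, 14, 3, 7, 11, 15, 4, 8, 12, 16]

def propagation_ratio(dx, dy):
    """R_p(dx, dy) = N_D(dx, dy) / 2^m for the 4-bit S-box."""
    hits = sum(1 for x in range(16) if S_BOX[x] ^ S_BOX[x ^ dx] == dy)
    return Fraction(hits, 16)

def permute(v):
    """w_j = v_{pi_P(j)}, bits numbered 1..16 from the left."""
    w = 0
    for j, src in enumerate(PI_P, start=1):
        w |= ((v >> (16 - src)) & 1) << (16 - j)
    return w

# (input xor, output xor) of the active S-boxes S^1_2, S^2_3, S^3_2, S^3_3
trail = [(0xB, 0x2), (0x4, 0x6), (0x2, 0x5), (0x2, 0x5)]
ratio = Fraction(1)
for dx, dy in trail:
    ratio *= propagation_ratio(dx, dy)

u4_xor = permute(0x0550)   # (v^3)' = 0000 0101 0101 0000 permuted into (u^4)'
```

Here `ratio` is 27/1024 and `u4_xor` is 0x0606, that is, 0000 0110 0000 0110.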


The algorithm now follows the informal description at the beginning of this section. Some values are shown in the table below; $N_d = 5000$ 4-tuples with right input and output xors were used; in the table prob = count/5000.

candidate subkey $(K^5_5, \dots, K^5_8, K^5_{13}, \dots, K^5_{16})$    prob

1 C 0.0000 

1 D 0.0000 

1 E 0.0000 

1 F 0.0000 

2 0 0.0000 

2 1 0.0136 

2 2 0.0068 

2 3 0.0068 

2 4 0.0244 

2 5 0.0000 

2 6 0.0068 

2 7 0.0068 

2 8 0.0030 

2 9 0.0024 

2 A 0.0032 

2 B 0.0022 

2 C 0.0000 

Notice that the entry for the subkey (2,4) in hexadecimal is 0.0244, very close to the expected 27/1024 ≈ 0.0264.

About the complexity of the attack: if p is the propagation ratio of the differential trail being used, then the number of 4-tuples required is approximated to be

$$N_d \approx 1/p.$$

In practice, $N_d$ is a small multiple of 1/p.

5.8 Applications to DES

In the case of DES, linear cryptanalysis is the more efficient of the two. A linear attack against DES used $2^{43}$ plaintext/ciphertext pairs, all of which were encrypted with the same unknown key.

It is interesting to notice that the number of operations required to break 16-round DES using differential cryptanalysis is $2^{55.1}$, compared to $2^{55}$ used by brute force. So, there is a very good reason behind the number of rounds of DES.


6 FINITE FIELDS 

6.1 Definitions 

Given a set S and a binary operation ∗, we say that S is closed under ∗ if, for any a,b ∈ S, we have a∗b ∈ S. 

We shall assume in the sequel that the sets are closed under the operations we consider. 

Example 6.1. The set {1,2,...,n} is not closed under +. 

□ 

A group is a structure (S,∗) such that 

(i) ∗ is associative: for all a,b,c ∈S, a∗(b∗c) = (a∗b)∗c 

(ii) it has identity, 1 S : for any a ∈ S, a∗1 S = 1 S ∗a = a 

(iii) each element a ∈ S has an inverse a ′ ∈ S: a∗a ′ = a ′ ∗a = 1 S 

A group (S,∗) is called abelian (commutative) if ∗ is commutative: for all a,b ∈ S, a∗b = b∗a. 

A structure (S,∗) satisfying (i) above is called a semigroup, and one satisfying (i)-(ii) is called a monoid.

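Example 6.2. (Z,+) is an abelian group. The set of all permutations on n elements, $S_n = \{\pi \mid \pi : \{1,2,\dots,n\} \to \{1,2,\dots,n\},\ \pi \text{ bijective}\}$, with composition ∘ is a group which is not commutative; for instance,

$$\begin{pmatrix} 1 & 2 & 3 \\ 2 & 1 & 3 \end{pmatrix} \circ \begin{pmatrix} 1 & 2 & 3 \\ 3 & 1 & 2 \end{pmatrix} = \begin{pmatrix} 1 & 2 & 3 \\ 1 & 3 & 2 \end{pmatrix} \quad \text{but} \quad \begin{pmatrix} 1 & 2 & 3 \\ 3 & 1 & 2 \end{pmatrix} \circ \begin{pmatrix} 1 & 2 & 3 \\ 2 & 1 & 3 \end{pmatrix} = \begin{pmatrix} 1 & 2 & 3 \\ 3 & 2 & 1 \end{pmatrix}$$

(here the permutation on the left is applied first). The set of nonnegative integers N with addition is not a group because there are no inverses; it is a commutative monoid. N−{0} is a commutative semigroup. (Z,×) is not a group because there are no inverses; it is a commutative monoid.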

□ 

A ring is a structure (S,+,×) such that

(i) (S,+) is an abelian group (additive identity is denoted 0 and additive inverse of a is denoted −a)

(ii) (S,×) is a semigroup

(iii) it has distributivity: for all a,b,c ∈ S, a×(b+c) = (a×b)+(a×c) and (b+c)×a = (b×a)+(c×a)

A field is a structure (S,+,×) such that

(i) (S,+) is an abelian group

(ii) (S −{0},×) is an abelian group (multiplicative identity is 1 and multiplicative inverse of a is $a^{-1}$)

(iii) it has distributivity

Example 6.3. (Z,+,×) is a ring but not a field. (Z n ,+,×) is a ring but, in general, not a field because only 

elements coprime with n are invertible. If p is prime, then (Z p ,+,×) is a field. Also (Q,+,×) and (R,+,×) 

are fields, but we shall be interested in finite fields only. 

□ 
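The statement about $Z_n$ in Example 6.3 is easy to test for small n. The sketch below lists the invertible elements using the criterion gcd(b,n) = 1 and checks whether every nonzero element is invertible; the function names are illustrative.

```python
from math import gcd

def units(n):
    """Elements of Z_n that are invertible modulo n."""
    return [b for b in range(1, n) if gcd(b, n) == 1]

def is_field(n):
    """(Z_n, +, x) is a field exactly when every nonzero element is invertible."""
    return n > 1 and len(units(n)) == n - 1

invertible_mod_12 = units(12)   # [1, 5, 7, 11]
```

For instance, `is_field(7)` holds while `is_field(12)` does not, since only 1, 5, 7 and 11 are invertible modulo 12.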

6.2 Modular arithmetic 

Recall that a is congruent to b modulo n, denoted a ≡ b (mod n) iff n | a−b. The remainder of a modulo n is 

denoted a mod n. Here are some properties of congruences: 

(i) a ≡ a (mod n) 

(ii) if a ≡ b (mod n), then b ≡ a (mod n) 

(iii) if a ≡ b (mod n) and b ≡ c (mod n), then a ≡ c (mod n) 

(iv) if a ≡ b (mod n) and c ≡ d (mod n), then a±c ≡ b±d (mod n) 

(v) if a ≡ b (mod n) and d | n, then a ≡ b (mod d) 

(vi) if a ≡ b (mod n) and a ≡ b (mod m) with gcd(n,m) = 1, then a ≡ b (mod nm) 

The set of residue classes modulo n is denoted Z n and (Z n ,+,×) is a commutative ring. If p is prime, then 

(Z p ,+,×) is a field. 

The greatest common divisor of a and b is the largest common divisor of a and b. It is computed by the 

Euclidean algorithm. 

Euclidean Algorithm 

- given: two positive integers r 0 and r 1 with r 0 > r 1 

- computes: gcd(r 0 ,r 1 ) 

Algorithm:

1. perform the following sequence of divisions

   r_0 = q_1 r_1 + r_2,   0 < r_2 < r_1
   r_1 = q_2 r_2 + r_3,   0 < r_3 < r_2
   ...
   r_{m−2} = q_{m−1} r_{m−1} + r_m,   0 < r_m < r_{m−1}
   r_{m−1} = q_m r_m

2. return gcd(r_0, r_1) = r_m

Also, as we know, b ∈ Z_n is invertible iff gcd(b,n) = 1. In such a case, the inverse b^{−1} of b modulo n is computed by the Extended Euclidean algorithm. The set of invertible elements, denoted

Z*_n = {b ∈ Z_n | gcd(b,n) = 1},

is an abelian group under multiplication.

Put: 

t 0 = 0 

t 1 = 1 

t j = (t j−2 −q j−1 t j−1 ) mod r 0 , if j ≥ 2 

Theorem 6.4. If gcd(r_0, r_1) = 1, then t_m = r_1^{−1} mod r_0.

Proof. For any 1 ≤ j ≤ m, we have r j ≡ t j r 1 (mod r 0 ). Since r m = gcd(r 0 ,r 1 ) = 1, we get 1 ≡ t m r 1 

(mod r 0 ), as claimed. 

□ 

Extended Euclidean Algorithm 

- given: two positive integers n and b 

- computes: the inverse of b modulo n, b −1 mod n, if it exists 

Algorithm: 

1. n_0 = n
2. b_0 = b
3. t_0 = 0
4. t = 1
5. q = ⌊n_0 / b_0⌋
6. r = n_0 − q·b_0
7. while r > 0 do
8.     temp = t_0 − q·t
9.     if temp ≥ 0 then temp = temp mod n
10.    else temp = n − ((−temp) mod n)
11.    t_0 = t
12.    t = temp
13.    n_0 = b_0
14.    b_0 = r
15.    q = ⌊n_0 / b_0⌋
16.    r = n_0 − q·b_0
17. if b_0 ≠ 1 then output "b has no inverse modulo n"
18. else return b^{−1} = t mod n

Note: Steps 9 and 10 treat negative values of temp separately because, in some programming languages, the modular reduction of a negative number yields a negative result.


Example 6.5. Let us compute 28 −1 mod 75. We have the computations below 

i    r_i   q_i   t_i
0    75    -      0
1    28    2      1
2    19    1     −2
3     9    2      3
4     1    9     −8

Therefore, 28 −1 mod 75 = (−8) mod 75 = 67. 

□ 
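The Extended Euclidean Algorithm above can be sketched in Python as follows. This is only an illustrative transcription: the function name `mod_inverse` is an assumption of this sketch, and because Python's `%` operator already returns a non-negative result for a positive modulus, the case distinction of steps 9 and 10 collapses into a single reduction.

```python
def mod_inverse(b, n):
    """Return b^{-1} mod n, following the Extended Euclidean Algorithm box.

    Assumes n >= 2 and 0 < b < n. Raises ValueError if gcd(b, n) != 1.
    """
    n0, b0 = n, b
    t0, t = 0, 1
    q, r = divmod(n0, b0)            # steps 5 and 6
    while r > 0:                     # step 7
        # Steps 8-10: Python's % is non-negative for n > 0,
        # so no separate branch for negative temp is needed.
        temp = (t0 - q * t) % n
        t0, t = t, temp              # steps 11 and 12
        n0, b0 = b0, r               # steps 13 and 14
        q, r = divmod(n0, b0)        # steps 15 and 16
    if b0 != 1:
        raise ValueError("b has no inverse modulo n")
    return t % n


print(mod_inverse(28, 75))  # the value computed in Example 6.5
```

Running it on the data of Example 6.5 reproduces the inverse found in the table.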

6.3 Polynomial rings 

Given a commutative ring (R,+,·), consider the set of polynomials in the indeterminate x 

R[x] = {a n x n +···+a 1 x+a 0 | n ≥ 0,a i ∈ R}. 

Addition and multiplication in R[x] are defined using the operations in R:

$$\sum_{i=0}^{n} a_i x^i + \sum_{i=0}^{m} b_i x^i = \sum_{i=0}^{\max(n,m)} (a_i + b_i)\, x^i,$$

$$\Big(\sum_{i=0}^{n} a_i x^i\Big) \cdot \Big(\sum_{i=0}^{m} b_i x^i\Big) = \sum_{i=0}^{n+m} \Big(\sum_{j+k=i} a_j b_k\Big) x^i,$$

where missing coefficients are taken to be 0.

Notice that, in general, we have:

- in a ring, long division: a = qb+r,

- in a field, exact division: a = qb, where q = ab^{−1}.

Example 6.6. In Z, 5/3 is 5 = 1×3+2. In Z 7 , 5/3 = 5×3 −1 = 5×5 = 4. 

Therefore, if we want division in a polynomial ring, we need that the coefficients form a field. Otherwise, 

even long division might not be possible. 

□ 

Example 6.7. In Z[x], 5x^2 / 3x is not possible. In Z_7[x], 5x^2 / 3x = 4x.

We shall therefore consider polynomial rings of the form Z p [x] with p prime. 

□ 

6.4 The ring Z p [x] 

For f(x),g(x) ∈ Z p [x], we say that f(x) divides g(x), denoted f(x) | g(x) iff there is q(x) ∈ Z p [x] such that 

f(x)q(x) = g(x). The degree of f(x), denoted deg(f), is the highest exponent on x in f(x). We say that g(x) 

and h(x) are congruent modulo f(x) iff f(x) | g(x)−h(x). 

Also, long division is possible here. For f(x) ≠ 0, there exist unique q(x) and r(x) such that g(x) = q(x)f(x)+r(x) and deg(r) < deg(f). Therefore, g(x) is congruent modulo f(x) to a unique polynomial of degree strictly less than deg(f).

Example 6.8. Fig 4.4 shows some examples of operations in Z 2 [x]. 

A polynomial f(x) is called irreducible iff there are no polynomials f 1 (x) and f 2 (x) both of non-zero degree 

such that f(x) = f 1 (x)f 2 (x). 

Z is a ring which is not a field. Using a prime p we can build Z p which is a field. Similarly, Z p [x] is not a 

field but we can construct one using an irreducible polynomial f(x) and the set of all residue classes modulo 

f(x), denoted Z p [x]/f(x). The operations in Z p [x]/f(x) are as in Z p [x] but followed by a reduction modulo 

f(x). 

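We also notice that both the Euclidean algorithm and the extended Euclidean algorithm work unchanged for polynomials in Z_p[x], so greatest common divisors and inverses in Z_p[x]/f(x) are computed exactly as in Z_n.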

□


Example 6.9. It can be verified that x^8+x^4+x^3+x+1 is irreducible. Let us compute gcd(x^7+x+1, x^8+x^4+x^3+x+1) and (x^7+x+1)^{−1} mod (x^8+x^4+x^3+x+1). The computations are shown in the table below.

i    r_i                 q_i           t_i
0    x^8+x^4+x^3+x+1     -             0
1    x^7+x+1             x             1
2    x^4+x^3+x^2+1       x^3+x^2+1     x
3    x                   x^3+x^2+x     x^4+x^3+x+1
4    1                   x             x^7

Thus, gcd(x^7+x+1, x^8+x^4+x^3+x+1) = 1 and (x^7+x+1)^{−1} mod (x^8+x^4+x^3+x+1) = x^7.

□ 
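The computation of Example 6.9 can be replayed with a short Python sketch. It assumes a representation that the notes only introduce later: a polynomial in Z_2[x] is stored as an integer whose bit i is the coefficient of x^i, so that addition and subtraction are both xor. The helper names `poly_divmod`, `poly_mul` and `poly_inverse` are hypothetical.

```python
def poly_divmod(a, b):
    """Long division in Z_2[x]: return (q, r) with a = q*b + r and deg r < deg b."""
    q = 0
    deg_b = b.bit_length() - 1
    while a.bit_length() - 1 >= deg_b:
        shift = (a.bit_length() - 1) - deg_b
        q ^= 1 << shift          # add x^shift to the quotient
        a ^= b << shift          # subtract x^shift * b (xor in Z_2)
    return q, a


def poly_mul(a, b):
    """Carry-less product of two polynomials in Z_2[x] (no reduction)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        b >>= 1
    return result


def poly_inverse(a, m):
    """Inverse of a modulo m in Z_2[x], via t_j = t_{j-2} - q_{j-1} t_{j-1}."""
    r0, r1 = m, a
    t0, t1 = 0, 1
    while r1 != 0:
        q, r2 = poly_divmod(r0, r1)
        r0, r1 = r1, r2
        t0, t1 = t1, t0 ^ poly_mul(q, t1)
    if r0 != 1:
        raise ValueError("polynomial is not invertible modulo m")
    return poly_divmod(t0, m)[1]


M = 0b100011011   # x^8 + x^4 + x^3 + x + 1
A = 0b10000011    # x^7 + x + 1
print(bin(poly_inverse(A, M)))  # bit 7 set only, i.e. x^7
```

The intermediate quotients and remainders produced by the loop are the q_i and r_i columns of the table above.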

6.5 Finite fields 

It can be shown that the number of elements in any finite field is a power of a prime, that is, p^n, p prime, n ≥ 1. The finite field with p^n elements is denoted F_{p^n} or GF(p^n). For n = 1, F_p is isomorphic to Z_p. For n ≥ 2, F_{p^n} is isomorphic to Z_p[x]/f(x), where f(x) is an irreducible polynomial of degree n. (It has p^n elements because each residue class is represented by a polynomial of degree less than n, which has n coefficients that can each take p values.) Any irreducible polynomial f(x) of degree n gives an isomorphic field.

Example 6.10. A field with 8 = 2 3 elements can be constructed using Z 2 [x] and the irreducible polynomial 

x 3 +x+1, that is Z 2 [x]/(x 3 +x+1). 

□ 

6.6 Motivation for using finite fields 

First, all encryption algorithms use arithmetic, so if we need division, then we have to work in a field (see the above examples). Second, for convenience and for implementation reasons, we work with integers that fit into a given number of bits, that is, we work with numbers between 0 and 2^n − 1.

Assume we have 8-bit integers. We can represent numbers from 0 to 255. Since 256 is not a prime, we can 

try the nearest smaller prime, which is 251. That means to use the field Z 251 . First, we have inefficient use of


memory. Second, the fact that some numbers cannot appear (251 to 255) represents additional information for 

potential attacks. 

Assuming we do not use division in the encryption/decryption algorithms, we can try to use Z 2 n which is 

not a field. For n = 3, the multiplication table for Z 2 3 is shown below: 

× in Z_8    0 1 2 3 4 5 6 7

0           0 0 0 0 0 0 0 0
1           0 1 2 3 4 5 6 7
2           0 2 4 6 0 2 4 6
3           0 3 6 1 4 7 2 5
4           0 4 0 4 0 4 0 4
5           0 5 2 7 4 1 6 3
6           0 6 4 2 0 6 4 2
7           0 7 6 5 4 3 2 1

On the other hand, the multiplication table for F 2 3, represented as Z 2 [x]/(x 3 + x + 1) is given below (each 

polynomial is represented as a number from 0 to 7 whose binary representation gives the coefficients): 

× in F_{2^3}    0 1 2 3 4 5 6 7

0               0 0 0 0 0 0 0 0
1               0 1 2 3 4 5 6 7
2               0 2 4 6 3 1 7 5
3               0 3 6 5 7 4 1 2
4               0 4 3 7 6 2 5 1
5               0 5 1 4 2 7 3 6
6               0 6 7 1 5 3 2 4
7               0 7 5 2 1 6 4 3

The distribution of numbers in the two tables is given below: 

integer                       1   2   3   4   5   6   7
occurrences for Z_8           4   8   4  12   4   8   4
occurrences for F_{2^3}       7   7   7   7   7   7   7

We can see a very uniform distribution for F 2 3 and very non-uniform for Z 8 . Such a distribution is very 

important for the security of a cryptosystem. 

Consequently, fields of the form F 2 n are attractive for cryptographic algorithms. 
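The two multiplication tables and the occurrence counts can be regenerated with a few lines of Python. This is a sketch under the same encoding as the tables (an element of F_{2^3} is the integer whose binary digits are the polynomial coefficients); the function names are made up for the illustration.

```python
from collections import Counter


def mul_z8(a, b):
    """Multiplication in the ring Z_8."""
    return (a * b) % 8


def mul_f8(a, b):
    """Multiplication in F_{2^3} = Z_2[x]/(x^3 + x + 1), elements as 3-bit integers."""
    result = 0
    for _ in range(3):
        if b & 1:
            result ^= a          # add the current multiple of a
        b >>= 1
        a <<= 1                  # multiply a by x
        if a & 0b1000:           # degree reached 3: reduce by x^3 + x + 1
            a ^= 0b1011
    return result


def occurrences(mul):
    """How often each of 1..7 appears in the full 8 x 8 multiplication table."""
    counts = Counter(mul(a, b) for a in range(8) for b in range(8))
    return [counts[v] for v in range(1, 8)]


print(occurrences(mul_z8))   # non-uniform
print(occurrences(mul_f8))   # uniform
```

In a field every non-zero row of the table is a permutation of the elements, which is why each non-zero value appears exactly 7 times for F_{2^3}.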

6.7 Computational considerations in F 2 n 

Addition in F 2 n is simply bitwise xor since this is the addition of Z 2 . 

Multiplication is slightly more complicated. We show how it can be done efficiently in F 2 8 represented as 

Z 2 [x]/m(x), with m(x) = x 8 +x 4 +x 3 +x+1. (This is used in AES.) We notice that for f(x) = b 7 x 7 +b 6 x 6 + 

b 5 x 5 +b 4 x 4 +b 3 x 3 +b 2 x 2 +b 1 x+b 0 , we have 

xf(x) mod m(x) = (b 7 x 8 +b 6 x 7 +b 5 x 6 +b 4 x 5 +b 3 x 4 +b 2 x 3 +b 1 x 2 +b 0 x) mod m(x) 

= b 6 x 7 +b 5 x 6 +b 4 x 5 +b 3 x 4 +b 2 x 3 +b 1 x 2 +b 0 x+b 7 (x 4 +x 3 +x+1) 

Let us denote polynomials in F_{2^8} as 8-bit blocks (b_7 b_6 b_5 b_4 b_3 b_2 b_1 b_0). Then

$$x f(x) = \begin{cases} (b_6 b_5 b_4 b_3 b_2 b_1 b_0 0), & \text{if } b_7 = 0,\\ (b_6 b_5 b_4 b_3 b_2 b_1 b_0 0) \oplus (00011011), & \text{if } b_7 = 1. \end{cases}$$

Therefore, multiplication will be done in two stages: 

- compute the multiplication with powers of x by repeating the above 

- xor the corresponding results 

The idea generalizes immediately to any F 2 n.


Example 6.11. We compute f(x)g(x) mod m(x) for f(x) = x 6 +x 4 +x 2 +x+1 and g(x) = x 7 +x+1. First, 

the powers of x: 

(01010111)(00000001) = (01010111) 

(01010111)(00000010) = (10101110) 

(01010111)(00000100) = (01011100)⊕(00011011)= (01000111) 

(01010111)(00001000) = (10001110) 

(01010111)(00010000) = (00011100)⊕(00011011)= (00000111) 

(01010111)(00100000) = (00001110) 

(01010111)(01000000) = (00011100) 

(01010111)(10000000) = (00111000) 

Next, we xor the results corresponding to 1, x, and x 7 . We get 

f(x)g(x) mod m(x) = (01010111)⊕(10101110)⊕(00111000)= (11000001)= x 7 +x 6 +1. 

□
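The two-stage multiplication of Section 6.7 (repeated multiplication by x, then xor of the selected results) can be sketched in Python as below. The names `xtime` and `gf_mul` are assumptions of this sketch; bytes are plain integers in the range 0..255.

```python
def xtime(a):
    """Multiply a by x in F_{2^8} = Z_2[x]/(x^8 + x^4 + x^3 + x + 1)."""
    a <<= 1
    if a & 0x100:        # b_7 was 1: drop x^8 and xor with (00011011)
        a ^= 0x11b
    return a


def gf_mul(a, b):
    """Multiply a and b in F_{2^8}: xor the multiples a * x^i for each set bit i of b."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a = xtime(a)
        b >>= 1
    return result


f = 0b01010111   # x^6 + x^4 + x^2 + x + 1
g = 0b10000011   # x^7 + x + 1
print(format(gf_mul(f, g), "08b"))  # compare with Example 6.11
```

The successive values of `a` inside `gf_mul` are exactly the rows of the "powers of x" list in Example 6.11.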


7 ADVANCED ENCRYPTION STANDARD 

7.1 The new standard 

The underlying algorithm, Rijndael (by J. Daemen and V. Rijmen), was chosen by NIST as the new standard

(to replace DES) in Oct 2000 out of 21 candidate algorithms. The initial criteria used by NIST were: 

- security – effort required to cryptanalyze the algorithm 

- cost – computational efficiency 

- algorithm and implementation characteristics – flexibility, simplicity, etc. 

These criteria reduced the candidates to 5. The second round of criteria contained: 

- general security – analysis by the cryptographic community 

- software implementations – variety of platforms and variation of speed with key size 

- restricted space environments – e.g., smart cards 

- hardware implementations 

- attacks on implementations – timing attacks and power analysis 

- encryption vs decryption – different algorithms or the same, timing differences

- key agility – ability to change keys quickly and with little effort 

- other versatility and flexibility – support for other key sizes, block sizes, number of rounds 

- potential for parallelism 

7.2 Description of AES 

The overall structure of AES is shown in Fig. 5.1.


The possible parameters of AES are shown in the table below 

Key size (words/bytes/bits)                4/16/128       6/24/192       8/32/256
Plaintext block size (words/bytes/bits)    4/16/128       4/16/128       4/16/128
Number of rounds                           10             12             14
Round key size (words/bytes/bits)          4/16/128       4/16/128       4/16/128
Expanded key size (words/bytes/bits)       44/176/1408    52/208/1664    60/240/1920

Here are some of the main characteristics of AES: 

- input to encryption and decryption algorithms is a 128-bit block 

- the block is represented as a matrix of 16 bytes, ordered by columns 

- the block is copied to the state array which, at the end is copied into output matrix – see Fig 5.2(a) 

- the key is expanded into an array of 44 key schedule words – see Fig. 5.2(b) 

There are four stages in each round, except for the last. A single (complete) round is shown in Fig 5.3.


Before discussing the operations in a round in detail, we make some more comments on the overall structure 

of AES: 

- it is not a Feistel structure – it allows parallelism 

- the expanded key has 44 32-bit words and each round uses 4 words (128 bits) 

- each stage is easily reversible 

- the encryption and decryption algorithms are not the same 

- there are four stages in each round: Substitute Bytes, Shift Rows, Mix Columns, and Add Round Key; the first three provide confusion, diffusion and nonlinearity; security is provided by the xor with the round key

We discuss next each of the four stages. AES uses arithmetic in the finite field F_{2^8} represented as Z_2[x]/m(x), for m(x) = x^8+x^4+x^3+x+1.

Substitute bytes 

This is a simple table lookup; see Fig 5.4(a). The AES S-box is a 16 × 16 matrix of byte values. Each byte of the state is mapped to a new value by taking the entry of the S-box in the row given by the first (leftmost) four bits and the column given by the last four bits.

The S-box itself is constructed as follows: 

- it is initialized with all values for bytes in increasing order following the row order 

- each byte is mapped to its inverse in F_{2^8} (the byte {00}, which has no inverse, is mapped to itself)

- each byte (b_7 b_6 b_5 b_4 b_3 b_2 b_1 b_0) is modified according to the transformation

$$\begin{bmatrix} b_0\\ b_1\\ b_2\\ b_3\\ b_4\\ b_5\\ b_6\\ b_7 \end{bmatrix} \leftarrow \begin{bmatrix} 1&0&0&0&1&1&1&1\\ 1&1&0&0&0&1&1&1\\ 1&1&1&0&0&0&1&1\\ 1&1&1&1&0&0&0&1\\ 1&1&1&1&1&0&0&0\\ 0&1&1&1&1&1&0&0\\ 0&0&1&1&1&1&1&0\\ 0&0&0&1&1&1&1&1 \end{bmatrix} \begin{bmatrix} b_0\\ b_1\\ b_2\\ b_3\\ b_4\\ b_5\\ b_6\\ b_7 \end{bmatrix} \oplus \begin{bmatrix} 1\\ 1\\ 0\\ 0\\ 0\\ 1\\ 1\\ 0 \end{bmatrix}$$

The S-box is designed to resist known attacks. There is low correlation between input and output bits. The output cannot be described as a simple mathematical function of the input. The S-box has no fixed point or opposite fixed point. It is invertible but not its own inverse.


Below are the S-box and its inverse. 

S-box 

0 1 2 3 4 5 6 7 8 9 a b c d e f 

0 63 7c 77 7b f2 6b 6f c5 30 01 67 2b fe d7 ab 76 

1 ca 82 c9 7d fa 59 47 f0 ad d4 a2 af 9c a4 72 c0 

2 b7 fd 93 26 36 3f f7 cc 34 a5 e5 f1 71 d8 31 15 

3 04 c7 23 c3 18 96 05 9a 07 12 80 e2 eb 27 b2 75 

4 09 83 2c 1a 1b 6e 5a a0 52 3b d6 b3 29 e3 2f 84 

5 53 d1 00 ed 20 fc b1 5b 6a cb be 39 4a 4c 58 cf 

6 d0 ef aa fb 43 4d 33 85 45 f9 02 7f 50 3c 9f a8 

7 51 a3 40 8f 92 9d 38 f5 bc b6 da 21 10 ff f3 d2 

8 cd 0c 13 ec 5f 97 44 17 c4 a7 7e 3d 64 5d 19 73 

9 60 81 4f dc 22 2a 90 88 46 ee b8 14 de 5e 0b db 

a e0 32 3a 0a 49 06 24 5c c2 d3 ac 62 91 95 e4 79 

b e7 c8 37 6d 8d d5 4e a9 6c 56 f4 ea 65 7a ae 08 

c ba 78 25 2e 1c a6 b4 c6 e8 dd 74 1f 4b bd 8b 8a 

d 70 3e b5 66 48 03 f6 0e 61 35 57 b9 86 c1 1d 9e 

e e1 f8 98 11 69 d9 8e 94 9b 1e 87 e9 ce 55 28 df 

f 8c a1 89 0d bf e6 42 68 41 99 2d 0f b0 54 bb 16 

inverse S-box 

0 1 2 3 4 5 6 7 8 9 a b c d e f 

0 52 09 6a d5 30 36 a5 38 bf 40 a3 9e 81 f3 d7 fb 

1 7c e3 39 82 9b 2f ff 87 34 8e 43 44 c4 de e9 cb 

2 54 7b 94 32 a6 c2 23 3d ee 4c 95 0b 42 fa c3 4e 

3 08 2e a1 66 28 d9 24 b2 76 5b a2 49 6d 8b d1 25 

4 72 f8 f6 64 86 68 98 16 d4 a4 5c cc 5d 65 b6 92 

5 6c 70 48 50 fd ed b9 da 5e 15 46 57 a7 8d 9d 84 

6 90 d8 ab 00 8c bc d3 0a f7 e4 58 05 b8 b3 45 06 

7 d0 2c 1e 8f ca 3f 0f 02 c1 af bd 03 01 13 8a 6b 

8 3a 91 11 41 4f 67 dc ea 97 f2 cf ce f0 b4 e6 73 

9 96 ac 74 22 e7 ad 35 85 e2 f9 37 e8 1c 75 df 6e 

a 47 f1 1a 71 1d 29 c5 89 6f b7 62 0e aa 18 be 1b 

b fc 56 3e 4b c6 d2 79 20 9a db c0 fe 78 cd 5a f4 

c 1f dd a8 33 88 07 c7 31 b1 12 10 59 27 80 ec 5f 

d 60 51 7f a9 19 b5 4a 0d 2d e5 7a 9f 93 c9 9c ef 

e a0 e0 3b 4d ae 2a f5 b0 c8 eb bb 3c 83 53 99 61 

f 17 2b 04 7e ba 77 d6 26 e1 69 14 63 55 21 0c 7d 

Here is an example of calculation for one position in the S-box. For position 01, we have ({01}) −1 = {01} = 

(00000001) and after transformation it becomes (01111100)= {7c}. 

Here is another one. We have {95} −1 = {8a} = (10001010). After transformation it becomes (00101010) = 

{2a}. Here is an example of SubBytes transformation: 

ea 04 65 85 

83 45 5d 96 

5c 33 98 b0 

f0 2d ad c5 

→ 

87 f2 4d 97 

ec 6e 4c 90 

4a c3 46 e7 

8c d8 95 a6
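The construction of the S-box described above (inversion in F_{2^8} followed by the affine transformation) can be sketched in Python. Two choices here are assumptions of the sketch rather than part of the notes: the inverse is computed as a^254, which works because the 255 non-zero elements form a multiplicative group, and the circulant matrix is applied as an xor of four left rotations of the byte. All function and variable names are hypothetical.

```python
def gf_mul(a, b):
    """Multiplication in F_{2^8} modulo x^8 + x^4 + x^3 + x + 1."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11b
        b >>= 1
    return result


def gf_inv(a):
    """Inverse in F_{2^8}, with {00} mapped to itself; uses a^254 = a^{-1}."""
    if a == 0:
        return 0
    result = 1
    for _ in range(254):
        result = gf_mul(result, a)
    return result


def rotl8(x, k):
    """Rotate an 8-bit value left by k positions."""
    return ((x << k) | (x >> (8 - k))) & 0xff


def sbox_entry(byte):
    """Inverse in F_{2^8}, then the affine transformation with constant {63}."""
    b = gf_inv(byte)
    return b ^ rotl8(b, 1) ^ rotl8(b, 2) ^ rotl8(b, 3) ^ rotl8(b, 4) ^ 0x63


SBOX = [sbox_entry(x) for x in range(256)]
INV_SBOX = [0] * 256
for x, s in enumerate(SBOX):
    INV_SBOX[s] = x

print(format(SBOX[0x01], "02x"), format(SBOX[0x95], "02x"))
```

The two printed entries correspond to the two worked positions {01} and {95} in the text, and `SBOX` can be compared row by row with the table above.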


Shift row 

It is shown in Fig 5.5(a). Row i of the state (i = 0,1,2,3) is circularly shifted left by i bytes. The idea is to mix the columns of state such that the new state contains in each column bytes from all previous columns. Here is an example of Shift Row transformation:

87 f2 4d 97 

ec 6e 4c 90 

4a c3 46 e7 

8c d8 95 a6 

→ 

87 f2 4d 97 

6e 4c 90 ec 

46 e7 4a c3 

a6 8c d8 95 
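A minimal Python sketch of the Shift Row transformation and its inverse, assuming the state is stored as a list of four rows of four bytes each (the storage layout and the function names are choices made for this illustration):

```python
def shift_rows(state):
    """Rotate row i of the 4 x 4 state left by i positions."""
    return [row[i:] + row[:i] for i, row in enumerate(state)]


def inv_shift_rows(state):
    """Rotate row i of the 4 x 4 state right by i positions."""
    return [row[4 - i:] + row[:4 - i] for i, row in enumerate(state)]


state = [
    [0x87, 0xf2, 0x4d, 0x97],
    [0xec, 0x6e, 0x4c, 0x90],
    [0x4a, 0xc3, 0x46, 0xe7],
    [0x8c, 0xd8, 0x95, 0xa6],
]
for row in shift_rows(state):
    print(" ".join(format(b, "02x") for b in row))
```

Applied to the left-hand state of the example, it prints the right-hand state.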

Mix column 

It is defined by the transformation 

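$$\begin{bmatrix} s_{00} & s_{01} & s_{02} & s_{03}\\ s_{10} & s_{11} & s_{12} & s_{13}\\ s_{20} & s_{21} & s_{22} & s_{23}\\ s_{30} & s_{31} & s_{32} & s_{33} \end{bmatrix} \leftarrow \begin{bmatrix} 02 & 03 & 01 & 01\\ 01 & 02 & 03 & 01\\ 01 & 01 & 02 & 03\\ 03 & 01 & 01 & 02 \end{bmatrix} \begin{bmatrix} s_{00} & s_{01} & s_{02} & s_{03}\\ s_{10} & s_{11} & s_{12} & s_{13}\\ s_{20} & s_{21} & s_{22} & s_{23}\\ s_{30} & s_{31} & s_{32} & s_{33} \end{bmatrix}$$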

The idea is to ensure good mixing among the bytes of each column. In fact the above transformation is done 

independently on columns (as seen in Fig. 5.3) and is equivalent to the following (done for each column i = 0..3): 

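$$\begin{bmatrix} s_{0i}\\ s_{1i}\\ s_{2i}\\ s_{3i} \end{bmatrix} \leftarrow \begin{bmatrix} 02 & 03 & 01 & 01\\ 01 & 02 & 03 & 01\\ 01 & 01 & 02 & 03\\ 03 & 01 & 01 & 02 \end{bmatrix} \begin{bmatrix} s_{0i}\\ s_{1i}\\ s_{2i}\\ s_{3i} \end{bmatrix}$$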

One criterion in constructing the Mix column transformation this way was to maximize the number of active (non-zero) bytes in input and output together. Also, any linear relation between bytes of input and output involves at least 5 different bytes. The coefficients in the matrix above are chosen as small as possible to improve speed on 8-bit processors. Notice that the inverse mix column transformation uses the matrix

$$\begin{bmatrix} 0e & 0b & 0d & 09\\ 09 & 0e & 0b & 0d\\ 0d & 09 & 0e & 0b\\ 0b & 0d & 09 & 0e \end{bmatrix}$$

whose coefficients are larger and so more expensive to implement. However, encryption is more important than 

decryption because: 

- in the CFB and OFB modes only encryption is used, 

- AES can be used (like any block cipher) for message authentication codes, where also only encryption is 

used. 
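The per-column form of Mix column, together with its inverse, can be sketched in Python as follows. The helper names are hypothetical, and the sample column is an arbitrary test input chosen for this sketch, not a value taken from the notes.

```python
def gf_mul(a, b):
    """Multiplication in F_{2^8} modulo x^8 + x^4 + x^3 + x + 1."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11b
        b >>= 1
    return result


MIX = [[0x02, 0x03, 0x01, 0x01],
       [0x01, 0x02, 0x03, 0x01],
       [0x01, 0x01, 0x02, 0x03],
       [0x03, 0x01, 0x01, 0x02]]

INV_MIX = [[0x0e, 0x0b, 0x0d, 0x09],
           [0x09, 0x0e, 0x0b, 0x0d],
           [0x0d, 0x09, 0x0e, 0x0b],
           [0x0b, 0x0d, 0x09, 0x0e]]


def mix_column(col, matrix=MIX):
    """Multiply a 4-byte column by the given matrix over F_{2^8}."""
    out = []
    for row in matrix:
        acc = 0
        for coeff, byte in zip(row, col):
            acc ^= gf_mul(coeff, byte)   # addition in F_{2^8} is xor
        out.append(acc)
    return out


col = [0xdb, 0x13, 0x53, 0x45]           # arbitrary sample column
mixed = mix_column(col)
print([format(b, "02x") for b in mixed])
print(mix_column(mixed, INV_MIX) == col)  # the inverse matrix undoes the mixing
```

Multiplying by 02 and 03 needs at most one shift, one conditional xor and one extra xor, which is the speed argument made above; the inverse matrix entries need several such steps each.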

Add round key 

This is simply an xor with the current round key; see Fig. 5.4(b). The operation is viewed as a column-wise operation between the 4 bytes of a state column and one word of the round key. It can also be viewed as a byte-level operation.


Key expansion 

The key expansion algorithm is given below 

KeyExpansion Algorithm

- given: the key key[16] with 16 bytes

- computes: the expanded key w[44] with 44 words

Algorithm

1. for i from 0 to 3 do
2.     w[i] = (key[4i], key[4i+1], key[4i+2], key[4i+3])
3. for i from 4 to 43 do
4.     temp = w[i−1]
5.     if i mod 4 = 0 then
6.         temp = SubWord(RotWord(temp)) ⊕ Rcon[i/4]
7.     w[i] = w[i−4] ⊕ temp

Some more details: 

- the key is copied in the first four words of the expanded key 

- the remainder of the expanded key is filled in four words at a time 

- each word w[i] depends on w[i−1] and w[i−4] 

- in three out of four cases, a simple xor is performed

- when i is a multiple of 4, a more complex function g is used:

  - RotWord is a one-byte circular left shift

  - SubWord is a byte substitution using the S-box

  - the result is then xored with a round constant Rcon[j] = (RC[j],0,0,0), where RC[1] = 1 and RC[j] = x·RC[j−1] = x^{j−1}, with the multiplication done in F_{2^8}; that is,

j        1   2   3   4   5   6   7   8   9   10
RC[j]    01  02  04  08  10  20  40  80  1b  36


Here is an example of application of function g. If the round key for round 8 is 

ea d2 73 21 b5 8d ba d2 31 2b f5 60 7f 8d 29 2f 

then the first 4 bytes of the round key for round 9 are computed below 

i (decimal)               36
temp = w[i−1]             7f8d292f
after RotWord             8d292f7f
after SubWord             5da515d2
Rcon[9]                   1b000000
after xor with Rcon       46a515d2
w[i−4]                    ead27321
w[i] = temp ⊕ w[i−4]      ac7766f3

The key expansion algorithm is designed to resist known attacks. The round-dependent round constant implies that the round key is generated differently in different rounds. Moreover, knowledge of part of the cipher key or of a round key does not enable computing many other round keys.
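The KeyExpansion algorithm for a 128-bit key can be sketched in Python as below. The sketch is self-contained, so it rebuilds the S-box from its definition (inverse in F_{2^8} computed as a^254, then the affine transformation) instead of copying the table; words are stored as 32-bit integers with the first key byte in the most significant position. All names (`rot_word`, `sub_word`, `rcon`, `g`, `key_expansion`) are illustrative assumptions.

```python
def gf_mul(a, b):
    """Multiplication in F_{2^8} modulo x^8 + x^4 + x^3 + x + 1."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11b
        b >>= 1
    return result


def build_sbox():
    """S-box from its definition: inverse in F_{2^8}, then the affine map."""
    sbox = []
    for x in range(256):
        inv = 0
        if x:
            inv = 1
            for _ in range(254):          # x^254 = x^{-1}
                inv = gf_mul(inv, x)
        s = inv
        for k in range(1, 5):             # xor of four left rotations
            s ^= ((inv << k) | (inv >> (8 - k))) & 0xff
        sbox.append(s ^ 0x63)
    return sbox


SBOX = build_sbox()


def rot_word(w):
    """One-byte circular left shift of a 32-bit word."""
    return ((w << 8) | (w >> 24)) & 0xffffffff


def sub_word(w):
    """Apply the S-box to each of the four bytes of a word."""
    return int.from_bytes(bytes(SBOX[b] for b in w.to_bytes(4, "big")), "big")


def rcon(j):
    """Round constant (RC[j], 0, 0, 0) with RC[j] = x^(j-1) in F_{2^8}."""
    rc = 1
    for _ in range(j - 1):
        rc = gf_mul(rc, 0x02)
    return rc << 24


def g(word, j):
    """The function applied to w[i-1] when i is a multiple of 4 (j = i/4)."""
    return sub_word(rot_word(word)) ^ rcon(j)


def key_expansion(key):
    """Expand a 16-byte key into 44 words, following steps 1-7."""
    if len(key) != 16:
        raise ValueError("this sketch handles 128-bit keys only")
    w = [int.from_bytes(key[4 * i:4 * i + 4], "big") for i in range(4)]
    for i in range(4, 44):
        temp = w[i - 1]
        if i % 4 == 0:
            temp = g(temp, i // 4)
        w.append(w[i - 4] ^ temp)
    return w


# Replaying the worked example: w[35] = 7f8d292f, w[32] = ead27321, i = 36.
print(format(0xead27321 ^ g(0x7f8d292f, 9), "08x"))
```

The printed word can be compared with the last entry of the table in the example above.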

7.3 Decryption 

As seen above, the decryption algorithm is different from the encryption algorithm. We show here a decryption 

algorithm which has the same structure as the encryption algorithm. It is shown in Fig. 5.7. 

Two observations are needed to make it clear that the algorithm works as intended. First, Substitute Bytes and Shift Rows are inverted and then interchanged. This is possible because InvSubBytes acts on each byte separately while InvShiftRows only moves bytes around, so

InvShiftRows(InvSubBytes(s_i)) = InvSubBytes(InvShiftRows(s_i))

Second, when interchanging AddRoundKey and InvMixColumns, we have to use

InvMixColumns(s_i ⊕ w_j) = (InvMixColumns(s_i)) ⊕ (InvMixColumns(w_j)).

This is true because InvMixColumns is linear: multiplication in F_{2^8} distributes over ⊕. Notice that we now have the operation InvMixColumns twice: on the state and on the round key.


8 MORE NUMBER THEORY 

...both Gauss and lesser mathematicians may be justified in rejoicing that there is one science 

[number theory] at any rate, and that their own, whose very remoteness from ordinary human 

activities should keep it gentle and clean. 

G. H. Hardy 

A Mathematician’s Apology, 1940 

G. H. Hardy would have been surprised and probably displeased with the increasing interest in 

number theory for applications to “ordinary human activities” such as information transmission and 

cryptography. 

Neal Koblitz

A Course in Number Theory and Cryptography, 1994

8.1 Complexity of arithmetic operations

- big-O notation

  - upper bound on the complexity (running time) of an algorithm in which constant factors are suppressed
  - formally, if f,g : Z → R, then f(n) = O(g(n)) iff there are c > 0 and n_0 ∈ Z such that 0 ≤ f(n) ≤ c·g(n) for all n ≥ n_0
  - example: 2n^2 + 100n − 4000 = O(n^2)

- representations of integers

  - n in base 2 has ⌊log_2 n⌋+1 ≈ log_2 n bits
  - n in base b has ⌊log_b n⌋+1 ≈ log_b n digits
  - this is the size of the input

- arithmetic operations

  - assume m is a k-bit integer and n is an l-bit integer with k ≤ l
  - addition – m+n can be done in time O(l)
  - subtraction – m−n can be done in time O(l)
  - multiplication – m×n can be done in time O(lk)
  - long division – n/m (n = qm+r, q > 0, 0 ≤ r ≤ m−1) can be done in time O(k(l−k)), which is O(kl)

- modular arithmetic operations

  - assume n is an l-bit integer and 0 ≤ m_1, m_2 ≤ n−1
  - modular addition – (m_1 + m_2) mod n can be done in time O(l)
  - modular subtraction – (m_1 − m_2) mod n can be done in time O(l)
  - modular multiplication – (m_1 m_2) mod n can be done in time O(l^2)

- greatest common divisor

  - computed by the Euclidean algorithm
  - complexity: the number of iterations is O(log r_0), so the total time is O(log^3 r_0) (proof idea: for any i, we have 2r_{i+2} < r_i)

- multiplicative inverses

  - computed by the Extended Euclidean algorithm
  - complexity: O(log^3 n)

8.2 The Chinese remainder theorem 

- a method for solving systems of congruences


Theorem 8.1 (Chinese Remainder Theorem). If m 1 ,...,m r are pairwise relatively prime positive integers and 

a 1 ,...,a r are integers, then the system 

$$\begin{cases} x \equiv a_1 \pmod{m_1}\\ x \equiv a_2 \pmod{m_2}\\ \quad\vdots\\ x \equiv a_r \pmod{m_r} \end{cases}$$

has a unique solution modulo M = m_1 m_2 ··· m_r, given by

$$x = \sum_{i=1}^{r} a_i M_i y_i \bmod M,$$

where M_i = M/m_i and y_i = M_i^{−1} mod m_i, 1 ≤ i ≤ r.

Proof. Assume x as given. For any 1 ≤ i,j ≤ r, i ≠ j, we have m_j | M_i and so a_i M_i y_i ≡ 0 (mod m_j). But a_j M_j y_j ≡ a_j (mod m_j) by the definition of y_j. Thus, x ≡ a_j (mod m_j) for every j, that is, x is a solution.

The uniqueness modulo M follows from the fact that the m_i's are pairwise relatively prime. Indeed, if there are two solutions x and x′, then x ≡ x′ (mod m_i) for all i, and so x and x′ must be congruent modulo M because of the property 4 of congruences (see section 2.4). (Notice that the uniqueness follows also by a counting argument.)

□ 

Complexity (for computing a solution): O(rlog 3 M) 

Example 8.2. Consider the system 

$$\begin{cases} x \equiv 5 \pmod{7}\\ x \equiv 3 \pmod{11}\\ x \equiv 10 \pmod{13} \end{cases}$$

We have here: a 1 = 5, a 2 = 3, a 3 = 10 and m 1 = 7, m 2 = 11, m 3 = 13. We compute M = 1001, M 1 = 143, 

M 2 = 91, M 3 = 77 and then y 1 = 5, y 2 = 4, y 3 = 12. The solution will be x = 13907 mod 1001 = 894. □ 

Remark 8.3. Consider the function χ : Z M → Z m1 × ··· × Z mr , defined by χ(x) = (x mod m 1 ,··· ,x 

mod m r ). The Chinese Remainder Theorem is equivalent to proving that χ is a bijection. In particular, this 

means we can represent numbers in Z M (which can be very large in practice) as tuples of their remainders 

modulo m i ,1 ≤ i ≤ r, (which are much smaller). This is called modular representation. 

Example 8.4. This example shows the above bijection χ. Consider r = 2, m_1 = 5, m_2 = 3. We have then M = 15 and the values of χ are:

χ(0) = (0,0)     χ(1) = (1,1)     χ(2) = (2,2)
χ(3) = (3,0)     χ(4) = (4,1)     χ(5) = (0,2)
χ(6) = (1,0)     χ(7) = (2,1)     χ(8) = (3,2)
χ(9) = (4,0)     χ(10) = (0,1)    χ(11) = (1,2)
χ(12) = (2,0)    χ(13) = (3,1)    χ(14) = (4,2)

Example 8.5. This example shows how large numbers can be manipulated using their modular representation 

as above. Consider r = 2, m 1 = 37, m 2 = 49. We have then M = 1813. The representations of the numbers 

973 and 678 are 

χ(678) = (678 mod 37,678 mod 49) = (12,41), 

χ(973) = (973 mod 37,973 mod 49) = (11,42). 

If we want to add or multiply then we work on each position in the tuples: 

χ(678+973) = ((12+11) mod 37, (41+42) mod 49) = (23,34),

χ(678×973) = ((12×11) mod 37, (41×42) mod 49) = (21,7).

□ 

□
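The formula of Theorem 8.1 and the modular representation χ of Remark 8.3 can be sketched in Python. The function names `crt` and `chi` are assumptions of this sketch; it relies on `math.prod` and on `pow(x, -1, m)` for modular inverses, both available from Python 3.8 on.

```python
from math import prod


def crt(residues, moduli):
    """Solve x = a_i (mod m_i) for pairwise relatively prime moduli m_i."""
    M = prod(moduli)
    x = 0
    for a_i, m_i in zip(residues, moduli):
        M_i = M // m_i
        y_i = pow(M_i, -1, m_i)      # y_i = M_i^{-1} mod m_i
        x += a_i * M_i * y_i
    return x % M


def chi(x, moduli):
    """Modular representation of x: the tuple of its remainders."""
    return tuple(x % m for m in moduli)


print(crt([5, 3, 10], [7, 11, 13]))   # the system of Example 8.2

# Componentwise arithmetic as in Example 8.5.
moduli = [37, 49]
a, b = chi(678, moduli), chi(973, moduli)
product = [(u * v) % m for u, v, m in zip(a, b, moduli)]
print(crt(product, moduli) == (678 * 973) % (37 * 49))
```

The second part multiplies 678 and 973 position by position on their tuples and then maps the result back with the theorem, which agrees with the product computed directly modulo M = 1813.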


8.3 The theorems of Fermat and Euler 

Theorem 8.6 (Fermat’s Little Theorem). If p is a prime, then, for any integer a such that p ∤ a, we have a^{p−1} ≡ 1 (mod p).

Proof. We first prove that

{0a mod p, 1a mod p, ..., (p−1)a mod p} = {0,1,...,p−1}.

Indeed, if ia ≡ ja (mod p) with 0 ≤ i,j ≤ p−1, then p | (i−j)a, hence p | (i−j) and so i = j.

Therefore, multiplying the non-zero elements of both sets, (p−1)! a^{p−1} ≡ (p−1)! (mod p). Since (p−1)! is not divisible by p, we have that p | (a^{p−1} − 1), as claimed.

□ 

Corollary 8.7. If p is a prime and a is an integer, then a^p ≡ a (mod p).

Euler’s theorem is a generalization of Fermat’s theorem, which is obtained for m prime. We shall need a lemma.

Lemma 8.8. If gcd(m,n) = 1, then φ(mn) = φ(m)φ(n). 

Proof. The Chinese Remainder Theorem shows that there is a 1-to-1 correspondence between the numbers 

i,0 ≤ i ≤ mn − 1 which are relatively prime with mn and the pairs (i 1 ,i 2 ) such that 0 ≤ i 1 ≤ m − 1, 

0 ≤ i 2 ≤ n−1, and i 1 is relatively prime with m, i 2 is relatively prime with n. □ 

Note: Using Lemma 8.8 we can prove the formula for Euler’s function (Theorem 2.5). 

Theorem 8.9 (Euler’s Theorem). If gcd(a,m) = 1, then a^{φ(m)} ≡ 1 (mod m).

Proof. Consider first the case of prime powers m = p^k, p prime, k ≥ 1. We use induction on k. The case k = 1 is Fermat’s Little Theorem. Assume the claim for k−1 and prove it for k. We have a^{φ(p^{k−1})} ≡ 1 (mod p^{k−1}) and so a^{p^{k−1} − p^{k−2}} = 1 + p^{k−1} b, for some integer b. Then, raising to the power p, we get a^{p^k − p^{k−1}} = 1 + p^k c, for some integer c.

For arbitrary m = p_1^{k_1} p_2^{k_2} ··· p_r^{k_r}, we use the result for prime powers and property 4 of congruences (see section 2.4).

□ 

Note: Euler’s theorem can also be proved the same way we proved Fermat’s theorem. Consider the positive integers which are smaller than m and relatively prime with m, say x_1, x_2, ..., x_{φ(m)}. Then {a x_i mod m | 1 ≤ i ≤ φ(m)} = {x_i | 1 ≤ i ≤ φ(m)} and the reasoning continues similarly.
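As a quick numerical sanity check (not a proof), the statements of this section can be tested for small moduli in Python. The function `phi` below simply counts the integers relatively prime with m, which is only practical for small m; its name and the chosen test values are assumptions of this sketch.

```python
from math import gcd


def phi(m):
    """Euler's function by direct counting (fine for small m only)."""
    return sum(1 for i in range(1, m + 1) if gcd(i, m) == 1)


def euler_holds(m):
    """Check a^phi(m) = 1 (mod m) for every a relatively prime with m."""
    e = phi(m)
    return all(pow(a, e, m) == 1 for a in range(1, m) if gcd(a, m) == 1)


print(phi(5) * phi(7) == phi(35))                  # Lemma 8.8 for m = 5, n = 7
print(all(euler_holds(m) for m in range(2, 200)))  # Euler's theorem for small m
```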

8.4 Cyclic groups and primitive elements 

Theorem 8.10 (Lagrange’s Theorem). If G is a finite group and H is a subgroup of G, then |H| divides |G|.

Proof. A coset of H is xH for x ∈ G. It is easy to see that two cosets are either identical or disjoint. Since 

the cardinality of any coset is |H|, we get that G is a disjoint union of |H|-element sets. The claim follows. □ 

Note: Because Z ∗ n is a multiplicative group of order φ(n), Lagrange’s theorem implies Euler’s theorem. 

If G is a multiplicative group and g ∈ G, then the order of g is the smallest positive integer m such that g^m = 1; it is denoted ord(g). We have that 〈g〉 = {g^i | 0 ≤ i ≤ ord(g)−1} is a subgroup of G.

Corollary 8.11. If G is a multiplicative group of order n and g ∈ G, then ord(g) | n.

A cyclic group is a group G having an element of order |G|; such an element is called a generator or primitive element of G. When G = Z*_p, it is also called a primitive root.

Lemma 8.12. If α ∈ Z*_n and i ≥ 1, then ord(α^i) = ord(α) / gcd(ord(α),i).

Proof. The order of α^i is the smallest positive k such that ik is a multiple of ord(α). That is, ik is the smallest positive integer which is a multiple of both i and ord(α), so ik = lcm(ord(α),i). We get k = lcm(ord(α),i) / i = ord(α) / gcd(ord(α),i).

□

Theorem 8.13. If p is prime, then Z ∗ p is a cyclic group. The number of primitive elements modulo p is φ(p−1).


Proof (sketch). Assume a is an element of order d of Z*_p. Then d | p−1. Also, the elements a, a^2, ..., a^d = 1 are all distinct and they are all the roots of the equation x^d = 1 (which has at most d roots in the field Z_p). Therefore, all elements of order d are powers of a. Also, by the previous lemma, a power a^j has order d iff gcd(d,j) = 1. Thus, if there is an element of order d, then there are exactly φ(d) elements of order d.

Every element has some order which divides p−1. Since ∑_{d | p−1} φ(d) = p−1 = |Z*_p|, it must be that there are always φ(d) elements of order d (and never 0).

In particular, there are φ(p−1) elements of order p−1.

□ 

Example 8.14. For p = 13, there should be φ(13−1) = 4 primitive elements modulo 13. Let us compute all 

powers of 2 modulo 13: 

i             0   1   2   3   4   5   6   7   8   9  10  11
2^i mod 13    1   2   4   8   3   6  12  11   9   5  10   7

We can see that 2 is a primitive element modulo 13. Also, 2^i is primitive if and only if gcd(i,12) = 1; that happens for i = 1,5,7,11. Therefore the primitive elements modulo 13 are 2,6,7,11.

□ 

Example 8.15. Let us compute all powers of all elements of Z ∗ 19. 

Each row lists x, x^2, x^3, ... up to the first power equal to 1 (column k holds x^k):

 k:   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16  17  18

      1
      2   4   8  16  13   7  14   9  18  17  15  11   3   6  12   5  10   1
      3   9   8   5  15   7   2   6  18  16  10  11  14   4  12  17  13   1
      4  16   7   9  17  11   6   5   1
      5   6  11  17   9   7  16   4   1
      6  17   7   4   5  11   9  16   1
      7  11   1
      8   7  18  11  12   1
      9   5   7   6  16  11   4  17   1
     10   5  12   6   3  11  15  17  18   9  14   7  13  16   8   4   2   1
     11   7   1
     12  11  18   7   8   1
     13  17  12   4  14  11  10  16  18   6   2   7  15   5   8   9   3   1
     14   6   8  17  10   7   3   4  18   5  13  11   2   9  12  16  15   1
     15  16  12   9   2  11  13   5  18   4   3   7  10  17   8   6  14   1
     16   9  11   5   4   7  17   6   1
     17   4  11  16   6   7   5   9   1
     18   1

We have ord(4) = 9, ord(4^5) = 9 / gcd(9,5) = 9, and ord(4^3) (= ord(7)) = 9 / gcd(9,3) = 3.

Also, there should be φ(18) = 6 primitive elements; those are 2,3,10,13,14,15. 

It might take very long to verify all powers of a number to check whether it is primitive or not. Here is a 

better way. 

Theorem 8.16. Let p be a prime and α ∈ Z*_p. Then α is primitive iff α^{(p−1)/q} ≢ 1 (mod p) for all primes q | (p−1).

Proof. If α is primitive, then α^i ≢ 1 (mod p), for all 1 ≤ i ≤ p−2; in particular for i = (p−1)/q.

Conversely, assume α is not primitive and let d be its order. By Lagrange’s theorem, d | (p−1) and, since α is not primitive, d < p−1. Thus (p−1)/d > 1; let q be a prime divisor of (p−1)/d. Then d divides (p−1)/q and we have

α^{(p−1)/q} ≡ 1 (mod p).

□ 

□


Example 8.17. For p = 13, in order to see that 2 is primitive modulo 13, we need only to check that 2^6 ≢ 1 (mod 13) and 2^4 ≢ 1 (mod 13).

For p = 19, we see that x ∈ Z*_19 is primitive by verifying that x^6 ≢ 1 (mod 19) and x^9 ≢ 1 (mod 19); see the above table.

□ 
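The test of Theorem 8.16 can be sketched in Python as follows. Factoring p−1 by trial division is an assumption that only makes sense for small primes like the ones in the examples; the function names are hypothetical.

```python
def prime_factors(n):
    """Set of prime divisors of n, by trial division (small n only)."""
    factors = set()
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors.add(d)
            n //= d
        d += 1
    if n > 1:
        factors.add(n)
    return factors


def is_primitive(alpha, p):
    """Theorem 8.16: alpha is primitive iff alpha^((p-1)/q) != 1 for all primes q | p-1."""
    return all(pow(alpha, (p - 1) // q, p) != 1 for q in prime_factors(p - 1))


def order(g, p):
    """Order of g in Z*_p by direct search, for cross-checking."""
    k, value = 1, g % p
    while value != 1:
        value = (value * g) % p
        k += 1
    return k


print([a for a in range(1, 13) if is_primitive(a, 13)])
print([a for a in range(1, 19) if is_primitive(a, 19)])
```

The two printed lists can be compared with Example 8.14 and Example 8.15; `order` checks the same elements the slow way, by listing powers.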

8.5 Discrete logarithms 

Given a group (G,·) and an element α ∈ G such that ord(α) = n, we have that 〈α〉 = {α^i | 0 ≤ i ≤ n−1} is a subgroup of G. Therefore, for each β ∈ 〈α〉, there is a unique a, 0 ≤ a ≤ n−1, such that α^a = β; this is called the logarithm of β in base α.

A particular case of this is G = Z*_p, p prime, and α a primitive element modulo p. This a is denoted log_α(β) (logarithm of β in base α modulo p) or ind_{α,p}(β) (the index of β for the base α modulo p).

Example 8.18. For p = 19 and α = 3, we have that log_3(5) = 4 and log_3(12) = 15.

□ 

Discrete Logarithm Problem (discretelog) 

- given: p a prime, α ∈ Z*_p primitive, β ∈ Z*_p

- compute: log_α β = a, 0 ≤ a ≤ p−2, such that α^a ≡ β (mod p)
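For small primes the problem can be solved by simply trying all exponents, as in the Python sketch below. This exhaustive search takes about p steps, so it only illustrates the definition; it is not a method one would use for the sizes of p that appear in cryptography. The function name is an assumption of the sketch.

```python
def discrete_log(alpha, beta, p):
    """Return a with 0 <= a <= p-2 and alpha^a = beta (mod p), by exhaustive search."""
    value = 1                       # alpha^0
    for a in range(p - 1):
        if value == beta % p:
            return a
        value = (value * alpha) % p
    raise ValueError("beta is not a power of alpha modulo p")


print(discrete_log(3, 5, 19), discrete_log(3, 12, 19))  # the values of Example 8.18
```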


9 PUBLIC-KEY CRYPTOGRAPHY AND RSA 

9.1 The idea of public keys 

In the classical model of cryptography we studied so far, Alice and Bob secretly choose a key K. Both encryption and decryption algorithms, e_K and d_K, are derived from this key. So, they have to meet prior to communicating, which is a major drawback; this is called the key management problem.

The idea behind public-key cryptography is to find ciphers where it is computationally infeasible to find d K 

from e K . If so, then the encryption key (Bob’s), called public key, can be made public. Thus, anyone can send 

messages to Bob without prior communication and only Bob can decrypt because only he knows the private 

key d K . 

The idea of public-key systems was developed by Diffie and Hellman in 1976. It is depicted in Fig. 9.1(a). 

(Figure 9.1(b) shows a different way of using it to provide authentication.) It is the most important change in 

the history of cryptography. Each party has two keys, one public and one private. Either key can be used for 

encryption and the other one will be used for decryption. The first realization 1 of a public-key system was RSA 

by Rivest, Shamir, and Adleman in 1977. 

1 Diffie and Hellman were the first to make public the ideas behind public-key cryptography and RSA was the first realization of these ideas which was made public. The idea of public-key cryptography was claimed to have been discovered first by NSA in the mid-1960s. The first documented introduction of these concepts happened in 1970 in a classified report by James Ellis from CESG (Communications-Electronics Security Group) of the GCHQ (the British Government Communications Headquarters). Also included in the report was a paper by Clifford Cocks which described a cipher which is essentially the same as RSA.


Notice that unconditional security is impossible here. Oscar, having y, simply tries x’s until he finds the 

unique one with e K (x) = y; K is the public key. Therefore, we study computational security. It should be 

computationally infeasible to determine the private key given the public one. 

The basic tools are one-way functions and trapdoor one-way functions. Notice that no function is known that is provably one-way.

Two important comments: 

- public-key encryption is not more secure than symmetric encryption, just different 

- public-key systems are much slower than symmetric ones and therefore they are not replacing the symmetric ones; the public-key ciphers are used for key management and signatures.

9.2 The RSA cryptosystem 

The RSA Cryptosystem 

$P = C = \mathbb{Z}_n$; $n = pq$, $p, q$ odd primes

$K = \{(n,p,q,a,b) \mid n = pq,\ p, q \text{ primes},\ ab \equiv 1 \pmod{\varphi(n)}\}$.

public: $n, b$

private: $p, q, a$

encryption: $e_K(x) = x^b \bmod n$

decryption: $d_K(y) = y^a \bmod n$

Note: $\varphi(n) = (p-1)(q-1)$

Let us prove the correctness of RSA. Since $ab \equiv 1 \pmod{\varphi(n)}$, there is an integer $t \ge 1$ such that $ab = t\varphi(n)+1$. If $x \in \mathbb{Z}_n^*$, then

$$y^a \equiv (x^b)^a \equiv x^{t\varphi(n)+1} \equiv (x^{\varphi(n)})^t x \equiv 1^t x \equiv x \pmod n.$$

If $x \in \mathbb{Z}_n - \mathbb{Z}_n^*$, then either $x = 0$ or $x$ is divisible by p or q but not both. The case $x = 0$ is clear. Assume $p \mid x$. Then obviously $x^{ab} \equiv x \pmod p$, since both sides are 0. Also, as above, $x^{ab} \equiv x \pmod q$: since $q \nmid x$, Fermat's theorem gives $x^{ab} = (x^{q-1})^{t(p-1)} x \equiv x \pmod q$. By property 4 of congruences (see section 2.4), we are done.

Example 9.1. Assume Bob chooses p = 101 and q = 113. Then n = 11413 and $\varphi(n) = 11200 = 2^6 \cdot 5^2 \cdot 7$. An integer b can be used as encryption exponent iff b is not divisible by 2, 5, or 7. (In practice, Bob will not factor φ(n) but just verify that gcd(b,φ(n)) = 1 and compute $b^{-1} \bmod \varphi(n)$ at the same time.) Assume Bob chooses b = 3533. Then the private decryption exponent is $a = b^{-1} \bmod 11200 = 6597$. Bob publishes n = 11413 and b = 3533 in a directory.

Now suppose Alice wants to encrypt the plaintext 9726 and send it to Bob. Then she computes

$9726^{3533} \bmod 11413 = 5761$

and sends it to Bob. Bob receives the ciphertext 5761 and computes

$5761^{6597} \bmod 11413 = 9726,$

that is, the plaintext.

□

9.3 RSA security

The security of RSA is based on two one-way functions:

- modular exponentiation (difficult problem: discrete logarithm)

- multiplication of primes (difficult problem: factoring)

- trapdoor: p and q; Bob can compute φ(n) = (p−1)(q −1) and so the decryption exponent a
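The toy parameters of Example 9.1 can be checked in a few lines. This is only a sketch for illustration: the helper name `rsa_private_exponent` is an assumption, and the code relies on Python's built-in three-argument `pow` (with exponent −1 for modular inverses, Python 3.8 or later). Real RSA uses much larger primes and padding.

```python
from math import gcd


def rsa_private_exponent(p, q, b):
    """Return (n, a) where a is the inverse of b modulo phi(n) = (p-1)(q-1)."""
    n = p * q
    phi = (p - 1) * (q - 1)
    if gcd(b, phi) != 1:
        raise ValueError("b must be relatively prime to phi(n)")
    a = pow(b, -1, phi)  # modular inverse via the extended Euclidean algorithm
    return n, a


n, a = rsa_private_exponent(101, 113, 3533)
ciphertext = pow(9726, 3533, n)      # encryption with the public exponent b
plaintext = pow(ciphertext, a, n)    # decryption with the private exponent a
print(n, a, ciphertext, plaintext)
```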


9.4 Implementation 

Setting up RSA 

1. Bob generates two large primes p and q 

2. Bob computes n = pq and φ(n) = (p−1)(q −1) 

3. Bob chooses a random b, 1 < b < φ(n), such that gcd(b,φ(n)) = 1

4. Bob computes a = b −1 mod φ(n) using the extended Euclidean alg. 

5. Bob publishes n and b in a directory as his public key 

Current factorization algorithms are able to factor numbers up to 155 decimal digits, which means 512 bits. 

Therefore, p and q should be primes of approximately 512 bits each such that n will have 1024 bits. 

We have to be able to find large primes reasonably fast. The Prime Number Theorem says that the number of primes smaller than N is approximately N/ln N. Thus, the probability that a number p chosen at random between 1 and N is prime is about 1/ln N; if p is chosen odd, then this probability becomes 2/ln N. For 512-bit primes, that means $2/\ln 2^{512} \approx 2/355$. That is, on average, one out of 178 random 512-bit odd integers is prime.

We shall guess and verify: choose a random number and test whether it is prime. The tests we use are probabilistic, so what we obtain is only a probable prime, but one which is prime with very high probability.

We shall also need efficient encryption and decryption. That is, we have to be able to do fast modular exponentiation. Computing $x^c \bmod n$ by repeated modular multiplication needs c − 1 modular multiplications, which is very inefficient if c is large; c can be as large as φ(n)−1, which is exponential in the number of bits of n.

9.5 Fast modular exponentiation 

Square-and-multiply algorithm 

- given: n, x, b (b is assumed in base 2, $b = \sum_{i=0}^{l-1} b_i 2^i$)

- computes: $x^b \bmod n$

Algorithm:

1. $z = 1$

2. for $i = l-1$ downto 0 do

3. $z = z^2 \bmod n$

4. if $b_i = 1$ then $z = zx \bmod n$

Complexity: $O(k^3)$, where $k = \lfloor \log_2 n \rfloor + 1$

Example 9.2. Assume, from the previous example, that n = 11413 and b = 3533. Alice wants to encrypt 9726 so she has to compute $9726^{3533} \bmod 11413$. The computation, using the Square-and-Multiply algorithm, is shown below (all values are taken modulo 11413). The ciphertext is 5761.

 i   b_i   z
11   1     1^2 × 9726 = 9726
10   1     9726^2 × 9726 = 2659
 9   0     2659^2 = 5634
 8   1     5634^2 × 9726 = 9167
 7   1     9167^2 × 9726 = 4958
 6   1     4958^2 × 9726 = 7783
 5   0     7783^2 = 6298
 4   0     6298^2 = 4629
 3   1     4629^2 × 9726 = 10185
 2   1     10185^2 × 9726 = 105
 1   0     105^2 = 11025
 0   1     11025^2 × 9726 = 5761

□
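The Square-and-Multiply algorithm translates almost line by line into code. The following is a minimal sketch (the function name is an assumption); it scans the bits of b from the most significant one down, exactly as in the table of Example 9.2. In practice one would simply call Python's built-in `pow(x, b, n)`, which is used here only as a cross-check.

```python
def square_and_multiply(x, b, n):
    """Compute x**b mod n by scanning the bits of b from high to low."""
    z = 1
    for bit in bin(b)[2:]:        # binary digits b_{l-1}, ..., b_0
        z = (z * z) % n           # always square
        if bit == "1":
            z = (z * x) % n       # multiply only when the bit is 1
    return z


print(square_and_multiply(9726, 3533, 11413))
```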


Remark 9.3. In practice, the exponentiation in RSA can be done faster. Assume we need to compute $x^e \bmod n$. We shall compute $e_p = e \bmod (p-1)$ and $e_q = e \bmod (q-1)$. Then, we compute $x^{e_p} \bmod p$ and $x^{e_q} \bmod q$. The number we look for, $x^e \bmod n$, is the unique solution z of the system

$$\begin{cases} z \equiv x^{e_p} \pmod p \\ z \equiv x^{e_q} \pmod q. \end{cases}$$

The exponentiation with a k-bit exponent requires at most 2k multiplications and squarings. (Expected (3/2)k.) Then, if p and q have t bits each, computing $x^e \bmod n$ will take approx. $2(2t)^3$ bit operations. The proposed variant takes only $2 \cdot 2t^3$ bit operations, which means it is 4 times faster.

Notice also that a system of two modular equations can be solved more easily than in the general case. Consider the system

$$\begin{cases} x \equiv a_1 \pmod p \\ x \equiv a_2 \pmod q \end{cases}$$

It has the solution $x = (a_1 + p(a_2 - a_1)(p^{-1} \bmod q)) \bmod (pq)$.
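The remark can be sketched as follows, using the two-equation formula above to recombine the half-size results. The function name `rsa_crt_pow` is an assumption, and the sketch assumes gcd(x, n) = 1, so that reducing the exponent modulo p−1 and q−1 is justified by Fermat's theorem.

```python
def rsa_crt_pow(x, e, p, q):
    """Compute x**e mod (p*q) from two exponentiations modulo p and q.

    Assumes p and q are distinct primes and gcd(x, p*q) = 1.
    """
    e_p, e_q = e % (p - 1), e % (q - 1)
    a1 = pow(x, e_p, p)            # z mod p
    a2 = pow(x, e_q, q)            # z mod q
    p_inv = pow(p, -1, q)          # p^{-1} mod q
    # x = a1 + p * ((a2 - a1) * p^{-1} mod q), reduced modulo pq
    return (a1 + p * (((a2 - a1) * p_inv) % q)) % (p * q)


# Decrypting the ciphertext of Example 9.1 with the private exponent 6597
print(rsa_crt_pow(5761, 6597, 101, 113))
```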

Still, if we compare the fastest hardware implementations for RSA and DES we see that symmetric ciphers are much faster than public-key ciphers. For instance, RSA can encrypt approx. 600 Kbit per second (with a 512-bit modulus n; i.e., about 154 decimal digits; $\log_2 10 = 3.3219280\ldots$) while DES can encrypt approx. 1 Gbit per second. That is, DES is more than 1500 times faster!

9.6 Complexity 

Given two problems P 1 and P 2 . We say that P 1 is polynomial-time reducible to P 2 , denoted P 1 ≤ P P 2 , iff 

a polynomial-time algorithm for P 2 gives a polynomial-time algorithm for P 1 

- that is, P 2 is at least as difficult as P 1 

If P 1 ≤ P P 2 and P 2 ≤ P P 1 , then P 1 and P 2 are called computationally equivalent. 

RSA Problem (rsap) 

- given: (n,b,y), n a product of two primes p and q, b a positive integer with gcd(b,(p−1)(q−1)) = 1, and 

y an integer 

- compute: x an integer such that x b ≡ y (mod n) 

Factoring Problem (factoring) 

- given: n a positive integer 

- compute: $n = p_1^{e_1} p_2^{e_2} \cdots p_k^{e_k}$, its prime factorization

Theorem 9.4. rsap ≤ P factoring. 

Conjecture 9.5. factoring ≤ P rsap. This means rsap and factoring are computationally equivalent. 

9.7 Randomized algorithms 

In some very real sense, computation is inherently randomized. It can be argued that the probability 

that a computer will be destroyed by a meteorite during any given microsecond of its operation is 

at least 2 −100 . 

Christos Papadimitriou 

Computational Complexity


- decision problem – a problem with yes/no answer 

- deterministic algorithm – no choice during computation – answer is yes or no 

- for a given input, the algorithm has the same execution path whenever it is run 

- P – problems solvable by deterministic algorithms running in polynomial time 

- nondeterministic algorithm – choices during computation – many answers; at least one positive answer 

means yes 

- guess and verify 

- NP – problems solvable by nondeterministic algorithms running in polynomial time 

- coNP – complements of those in NP 

- NP-complete – the hardest problems in NP; if any of those can be solved in polynomial time, then 

all in NP can (there are thousands of NP-complete problems which are believed to have no deterministic 

polynomial-time algorithms) 

- randomized algorithm – random choices 

- the execution path may differ each time the algorithm is run on the same input 

- Monte Carlo algorithms 

- the yes answers are always correct while the no answers might be incorrect 

- (no false positives; yes-biased) 

- the probability of false negatives is at most 1/2

- the complexity class of problems with polynomial-time Monte Carlo algorithms is denoted RP (randomized polynomial time)

- Las Vegas algorithms 

- the answer is always correct but there might be no answer 

- the complexity class of problems with polynomial-time Las Vegas algorithms is denoted ZPP (zero probability 

of error) 

- ZPP = RP ∩ coRP 

- Atlantic City algorithms

- the probability of right answer is larger than the probability of error 

- complexity class BPP (bounded probability of error) 

Theorem 9.6. P ⊆ ZPP ⊆ RP ⊆ BPP ∩ NP 

[Figure 11: Complexity classes. Diagram of the classes P, ZPP, RP, coRP, BPP = coBPP, NP, coNP, and NP-complete.]


9.8 Primality tests 

Composites Problem (composites) 

- given: n a positive integer 

- compute: whether n is composite or not 

Assume p is an odd prime. An integer x is called a quadratic residue modulo p if $x \not\equiv 0 \pmod p$ and the congruence $y^2 \equiv x \pmod p$ has a solution in $\mathbb{Z}_p$. x is a quadratic non-residue if $x \not\equiv 0 \pmod p$ and x is not a quadratic residue modulo p.

If p is an odd prime and a is a quadratic residue modulo p, then the equation $x^2 \equiv a \pmod p$ has exactly two solutions (square roots of a modulo p). Indeed, put $a \equiv y^2 \pmod p$. Then $x^2 \equiv y^2 \pmod p$ and so $p \mid (x-y)(x+y)$ and hence $x \equiv \pm y \pmod p$.

Theorem 9.7 (Euler’s criterion). Let p be an odd prime. Then x is a quadratic residue modulo p iff

$$x^{(p-1)/2} \equiv 1 \pmod p.$$

Proof. If $x \equiv y^2 \pmod p$, then

$$x^{(p-1)/2} \equiv (y^2)^{(p-1)/2} \equiv y^{p-1} \equiv 1 \pmod p.$$

Conversely, let b be a primitive element modulo p and $x \equiv b^i \pmod p$, for some i. We have

$$1 \equiv x^{(p-1)/2} \equiv (b^i)^{(p-1)/2} \equiv b^{i(p-1)/2} \pmod p.$$

Now $p-1 = \mathrm{ord}(b)$ must divide $i(p-1)/2$, hence i is even and $\pm b^{i/2}$ are the square roots of x.

□

Quadratic Residues Problem (quadratic residues) 

- given: p an odd prime and x an integer, 1 ≤ x ≤ p−1

- compute: whether x is a quadratic residue modulo p or not

Algorithm: use Euler’s criterion

Complexity: $O((\log p)^3)$
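Euler's criterion gives a one-line test. The sketch below (the function name is an assumption) decides quadratic residuosity with a single modular exponentiation; the loop at the end compares it against the definition for p = 19.

```python
def is_quadratic_residue(x, p):
    """Euler's criterion: x is a QR mod the odd prime p iff x^((p-1)/2) = 1."""
    if x % p == 0:
        raise ValueError("x must not be divisible by p")
    return pow(x, (p - 1) // 2, p) == 1


# Quadratic residues modulo 19, found with the criterion
print([x for x in range(1, 19) if is_quadratic_residue(x, 19)])
```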

The Legendre symbol, denoted $\left(\frac{a}{p}\right)$, is defined, for p an odd prime and a ≥ 0, by

$$\left(\frac{a}{p}\right) = \begin{cases} 0 & \text{if } a \equiv 0 \pmod p \\ 1 & \text{if } a \text{ is a quadratic residue modulo } p \\ -1 & \text{if } a \text{ is a quadratic non-residue modulo } p \end{cases}$$

Theorem 9.8. If p is an odd prime, then

$$\left(\frac{a}{p}\right) \equiv a^{(p-1)/2} \pmod p.$$

Proof. We have seen in Theorem 9.7 that a is a quadratic residue modulo p iff $a^{(p-1)/2} \equiv 1 \pmod p$. Clearly, $a^{(p-1)/2} \equiv 0 \pmod p$ iff $a \equiv 0 \pmod p$. Then, if a is a quadratic non-residue modulo p, then $a^{(p-1)/2} \equiv -1 \pmod p$ since $a^{p-1} \equiv 1 \pmod p$ and $a^{(p-1)/2} \not\equiv 1 \pmod p$.

□

We define next a generalization of the Legendre symbol which works for all odd positive integers n (not necessarily primes).

The Jacobi symbol, denoted $\left(\frac{a}{n}\right)$, for n an odd positive integer and a ≥ 0, is defined as follows. Assuming $n = p_1^{e_1} p_2^{e_2} \cdots p_k^{e_k}$ is the prime factorization of n, then

$$\left(\frac{a}{n}\right) = \prod_{i=1}^{k} \left(\frac{a}{p_i}\right)^{e_i}$$


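Example 9.9. Consider the Jacobi symbol $\left(\frac{6278}{9975}\right)$. Because $9975 = 3 \times 5^2 \times 7 \times 19$, we have

$$\left(\frac{6278}{9975}\right) = \left(\frac{6278}{3}\right)\left(\frac{6278}{5}\right)^2\left(\frac{6278}{7}\right)\left(\frac{6278}{19}\right) = \left(\frac{2}{3}\right)\left(\frac{3}{5}\right)^2\left(\frac{6}{7}\right)\left(\frac{8}{19}\right) = (-1)(-1)^2(-1)(-1) = -1.$$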

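We shall need to be able to compute the Jacobi symbol and, fortunately, we don’t have to factorize n. The properties below help us do this; n is assumed to be an odd integer:

1. if $m_1 \equiv m_2 \pmod n$ then $\left(\frac{m_1}{n}\right) = \left(\frac{m_2}{n}\right)$

2. $\left(\frac{2}{n}\right) = \begin{cases} 1 & \text{if } n \equiv \pm 1 \pmod 8 \\ -1 & \text{if } n \equiv \pm 3 \pmod 8 \end{cases}$

3. $\left(\frac{m_1 m_2}{n}\right) = \left(\frac{m_1}{n}\right)\left(\frac{m_2}{n}\right)$;

- in particular, if $m = 2^k t$, t odd, then $\left(\frac{m}{n}\right) = \left(\frac{2}{n}\right)^k \left(\frac{t}{n}\right)$

4. if m, n are odd, then $\left(\frac{m}{n}\right) = \begin{cases} -\left(\frac{n}{m}\right) & \text{if } m \equiv n \equiv 3 \pmod 4 \\ \left(\frac{n}{m}\right) & \text{otherwise} \end{cases}$

The complexity of this algorithm is $O((\log n)^3)$.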

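Example 9.10. We evaluate below the Jacobi symbol $\left(\frac{7411}{9283}\right)$.

$$\begin{aligned}
\left(\frac{7411}{9283}\right) &= -\left(\frac{9283}{7411}\right) && \text{(property 4)} \\
&= -\left(\frac{1872}{7411}\right) && \text{(property 1)} \\
&= -\left(\frac{2}{7411}\right)^4 \left(\frac{117}{7411}\right) && \text{(property 3)} \\
&= -\left(\frac{117}{7411}\right) && \text{(property 2)} \\
&= -\left(\frac{7411}{117}\right) && \text{(property 4)} \\
&= -\left(\frac{40}{117}\right) && \text{(property 1)} \\
&= -\left(\frac{2}{117}\right)^3 \left(\frac{5}{117}\right) && \text{(property 3)} \\
&= \left(\frac{5}{117}\right) && \text{(property 2)} \\
&= \left(\frac{117}{5}\right) && \text{(property 4)} \\
&= \left(\frac{2}{5}\right) && \text{(property 1)} \\
&= -1 && \text{(property 2)}
\end{aligned}$$

□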
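Properties 1 to 4 yield an algorithm very similar to Euclid's. The following is a minimal sketch of one common way to organize it (the function name `jacobi` and the loop structure are choices made here, not taken from the notes): factors of 2 are removed with property 2, the symbol is flipped with property 4, and the numerator is reduced with property 1.

```python
def jacobi(a, n):
    """Jacobi symbol (a/n) for an odd positive integer n, without factoring n."""
    if n <= 0 or n % 2 == 0:
        raise ValueError("n must be an odd positive integer")
    a %= n                                  # property 1
    result = 1
    while a != 0:
        while a % 2 == 0:                   # properties 3 and 2: pull out 2's
            a //= 2
            if n % 8 in (3, 5):
                result = -result
        a, n = n, a                         # property 4: reciprocity
        if a % 4 == 3 and n % 4 == 3:
            result = -result
        a %= n                              # property 1 again
    return result if n == 1 else 0          # gcd > 1 gives symbol 0


print(jacobi(7411, 9283), jacobi(6278, 9975))
```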

Suppose now n > 1 is odd. If n is prime, then $\left(\frac{a}{n}\right) \equiv a^{(n-1)/2} \pmod n$, for any a. On the other hand, if n is composite, it may or may not be the case that $\left(\frac{a}{n}\right) \equiv a^{(n-1)/2} \pmod n$. If this congruence holds, then n is called an Euler pseudoprime to the base a. For instance, 91 is an Euler pseudoprime to the base 10.

It can be shown that, for any odd composite n, n is an Euler pseudoprime to the base a for at most half of the integers $a \in \mathbb{Z}_n^*$. Also, $\left(\frac{a}{n}\right) = 0$ iff gcd(a,n) > 1, which means, in the case 1 ≤ a ≤ n−1, that n is composite.

Solovay-Strassen Primality Test 

- given: n an odd integer 

- computes: whether n is prime (probable) or composite (sure) 

Algorithm: 

1. choose a random integer a, 1 ≤ a ≤ n−1

2. $x \leftarrow \left(\frac{a}{n}\right)$

3. if x = 0 then

4. return (“n is composite”)

5. $y \leftarrow a^{(n-1)/2} \bmod n$

6. if $x \equiv y \pmod n$ then

7. return (“n is prime”)

8. else

9. return (“n is composite”)

Complexity: O((logn) 3 ) 
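One round of the Solovay-Strassen test can be sketched as below. The names `solovay_strassen_round` and `is_probable_prime` are assumptions; the Jacobi symbol is computed by a compact helper so that the block is self-contained. Repeating the round with independent random bases reduces the error probability as discussed in the notes.

```python
import random


def jacobi(a, n):
    """Jacobi symbol (a/n), n odd and positive."""
    a %= n
    result = 1
    while a != 0:
        while a % 2 == 0:
            a //= 2
            if n % 8 in (3, 5):
                result = -result
        a, n = n, a
        if a % 4 == 3 and n % 4 == 3:
            result = -result
        a %= n
    return result if n == 1 else 0


def solovay_strassen_round(n, a):
    """Return True if n passes the test for base a ("n is prime")."""
    x = jacobi(a, n)
    if x == 0:
        return False                    # gcd(a, n) > 1: n is composite
    y = pow(a, (n - 1) // 2, n)
    return x % n == y                   # x = -1 is compared as n - 1


def is_probable_prime(n, rounds=50, rng=random):
    """Run the test with `rounds` random bases; n must be odd and > 2."""
    return all(solovay_strassen_round(n, rng.randrange(1, n))
               for _ in range(rounds))


print(is_probable_prime(101), solovay_strassen_round(91, 10))
```

Note that base 10 fools the test for n = 91, in line with the remark that 91 is an Euler pseudoprime to the base 10, while base 2 exposes it as composite.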

□


By the above discussion we have the following theorem. 

Theorem 9.11. The Solovay-Strassen algorithm is a yes-biased Monte Carlo algorithm for Composites with probability of error at most 1/2.

Notice that the probability of interest for us is

$$\mathrm{Prob}(n \text{ odd composite} \mid \text{alg says ‘}n\text{ is prime’ } m \text{ times in succession}) \le \frac{\ln n - 2}{\ln n - 2 + 2^{m+1}}$$

and not

$$\mathrm{Prob}(\text{alg says ‘}n\text{ is prime’ } m \text{ times in succession} \mid n \text{ odd composite}) \le 2^{-m}.$$

In practice, one would run the test about 50 to 100 times, which (for n of about 512 bits) would reduce the probability of error to something like $0.157 \times 10^{-12}$ or $0.139 \times 10^{-27}$.

We present next another primality test algorithm which is faster in practice. 

Miller-Rabin Primality Test 

- given: n an odd integer 

- computes: whether n is prime (probable) or composite (sure) 

Algorithm: 

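1. write $n-1 = 2^k m$, m odd

2. choose a random integer a, 1 ≤ a ≤ n−1

3. $b \leftarrow a^m \bmod n$

4. if $b \equiv 1 \pmod n$ then

5. return (“n is prime”)

6. for i from 0 to k −1 do

7. if $b \equiv -1 \pmod n$ then

8. return (“n is prime”)

9. else

10. $b \leftarrow b^2 \bmod n$

11. return (“n is composite”)

Complexity: $O((\log n)^3)$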

Even if the order of complexity is the same, in practice, it performs better than Solovay-Strassen algorithm. 

Theorem 9.12. The Miller-Rabin algorithm is a yes-biased Monte Carlo algorithm for Composites with probability of error at most 1/4.

Proof. (for yes-biased) Assume n is prime but the algorithm answers ‘n is composite’. So, $a^m \not\equiv 1 \pmod n$ and also $a^{2^i m} \not\equiv -1 \pmod n$, for all 0 ≤ i ≤ k − 1. As n is prime, by Fermat’s theorem we have $a^{2^k m} \equiv 1 \pmod n$. Hence $a^{2^{k-1} m}$ is a square root of 1 modulo n, so it is congruent to one of ±1. Thus, $a^{2^{k-1} m} \equiv 1 \pmod n$ (as it is not congruent to −1; the only square roots of 1 modulo n are ±1) so again we have a square root of 1 modulo n. Continuing like this, we finally get that $a^m \equiv 1 \pmod n$, a contradiction.

□ 
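The Miller-Rabin test above can be sketched as follows; the function names are assumptions. `miller_rabin_round` follows the steps of the algorithm for a fixed base a, and `is_probable_prime` repeats it with random bases.

```python
import random


def miller_rabin_round(n, a):
    """Return True if odd n > 2 passes the Miller-Rabin test for base a."""
    k, m = 0, n - 1
    while m % 2 == 0:                # write n - 1 = 2^k * m with m odd
        k += 1
        m //= 2
    b = pow(a, m, n)
    if b == 1:
        return True                  # "n is prime"
    for _ in range(k):
        if b == n - 1:
            return True              # "n is prime"
        b = (b * b) % n
    return False                     # "n is composite" (always correct)


def is_probable_prime(n, rounds=20, rng=random):
    """Repeat the test with random bases; n must be odd and > 2."""
    return all(miller_rabin_round(n, rng.randrange(1, n))
               for _ in range(rounds))


print(is_probable_prime(101), miller_rabin_round(561, 2))
```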

9.9 Attacks on RSA 

A first obvious attack is to factor n. Another possible attack is to find φ(n). This is no easier than factoring. Indeed, if n and φ(n) are known, then we have n = pq, φ(n) = (p−1)(q−1) and so $p^2 - (n - \varphi(n) + 1)p + n = 0$, which gives p and the factorization of n.

Example 9.13. If n = 84773093 and φ(n) = 84754668 was somehow discovered, then

$p^2 - 18426p + 84773093 = 0$

which has the roots 9539 and 8887. These are the factors of n.

□ 
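The quadratic equation above can be solved with exact integer arithmetic. This is a small sketch (the function name is an assumption) using `math.isqrt` to avoid floating-point rounding.

```python
from math import isqrt


def factor_from_phi(n, phi):
    """Recover p <= q with n = p*q from n and phi(n) = (p-1)(q-1)."""
    s = n - phi + 1                  # s = p + q
    disc = s * s - 4 * n             # discriminant of x^2 - s*x + n
    r = isqrt(disc)
    if r * r != disc:
        raise ValueError("phi is not consistent with n = p*q")
    return (s - r) // 2, (s + r) // 2


print(factor_from_phi(84773093, 84754668))
```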

We discuss in this subsection some of the most important attacks against RSA, except for factoring algorithms, which are discussed in a separate section.


9.9.1 Decryption exponent 

We shall show that any algorithm to compute the decryption exponent can be used as an oracle in a probabilistic algorithm for factoring n. This means that computing the decryption exponent is no easier than factoring. In particular, it means that if a is revealed, then n is also compromised. Therefore, in such a case, Bob has to choose a new modulus n and not only a new decryption exponent.

The idea is as follows. If we know a non-trivial square root of 1 modulo n, then we can factor n in polynomial time. Let us see how. The square roots of 1 modulo n = pq are the x with $x^2 \equiv 1 \pmod n$. This is equivalent to $x^2 \equiv 1 \pmod p$ and $x^2 \equiv 1 \pmod q$, which, in turn, is equivalent to $x \equiv \pm 1 \pmod p$ and $x \equiv \pm 1 \pmod q$. Thus, there are four square roots of 1 modulo n; two are trivial, ±1 (mod n), and two are non-trivial, that is, the other two (additive inverses of each other). (In general, they can be found using the Chinese Remainder Theorem.)

Assume now x is a non-trivial square root of 1 modulo n = pq. Then $n \mid (x-1)(x+1)$ but $n \nmid (x \pm 1)$. Therefore gcd(x+1,n) is either p or q; similarly for gcd(x−1,n). Notice that gcd can be computed easily.

Example 9.14. Assume n = 403 = 13×31. The four square roots of 1 modulo 403 are 1, 92, 311, and 402. 

The square root 92 is the solution of the system 

{ 

x ≡ 1 (mod 13) 

x ≡ −1 (mod 31). 

and the other nontrivial root, 311, is the solution of 

{ x ≡ 1 (mod 31) 

x ≡ −1 (mod 13). 

Now, assuming we know the root 92, we compute gcd(93,403) = 31 or gcd(91,403) = 13. 

□ 

Factoring algorithm using an oracle for decryption exponent 

- given: n = pq product of two odd (unknown) primes and a,b decryption/encryption exponents 

- computes: p and q (probable) 

Algorithm: 

1. write $ab - 1 = 2^s r$, r odd

2. choose random w, 1 ≤ w ≤ n−1

3. x ← gcd(w,n)

4. if 1 < x < n then

5. return ‘success: x,n/x’

6. $v \leftarrow w^r \bmod n$

7. if $v \equiv 1 \pmod n$ then

8. return ‘failure’

9. while $v \not\equiv 1 \pmod n$ do

10. $v_0 = v$

11. $v = v^2 \bmod n$

12. if $v_0 \equiv -1 \pmod n$ then

13. return ‘failure’

14. else return ‘success: $x = \gcd(v_0 + 1, n)$, n/x’

If we are lucky to find a w which is a multiple of p or q, then we are done in step 5. If not, then w is relatively prime to n and we compute $w^r, w^{2r}, w^{4r}, \ldots$ by repeated squaring until $w^{2^t r} \equiv 1 \pmod n$. Since $ab - 1 = 2^s r \equiv 0 \pmod{\varphi(n)}$, Euler’s theorem gives $w^{2^s r} \equiv 1 \pmod n$ and hence the while loop terminates after at most s iterations. At the end of the loop we have found $v_0$ such that $v_0^2 \equiv 1 \pmod n$ but $v_0 \not\equiv 1 \pmod n$. If $v_0 \equiv -1 \pmod n$, then it gives nothing new and the algorithm fails. If not, then $v_0$ is a nontrivial square root of 1 modulo n and we can factor n as above.


Example 9.15. Suppose n = 89855713, b = 34986517, and a = 82330933. Assume also w = 5. We compute

$ab - 1 = 2^3 \times 360059073378795$.

We have then

$w^r \bmod n = 85877701$

and it happens that

$85877701^2 \equiv 1 \pmod n$.

Thus, the algorithm will return the value

$x = \gcd(85877702, n) = 9103$.

The other factor of n is n/9103 = 9871.

It can be shown that the probability of success is at least 1/2.

□
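The factoring algorithm of this subsection can be sketched as below. The function name is an assumption; unlike the algorithm in the notes, which reports ‘failure’ for an unlucky w, this sketch simply retries with a fresh random w, which is reasonable since each attempt succeeds with probability at least 1/2.

```python
import random
from math import gcd


def factor_with_exponents(n, a, b, rng=random):
    """Factor n = p*q given exponents with a*b = 1 (mod phi(n))."""
    r = a * b - 1
    while r % 2 == 0:                # a*b - 1 = 2^s * r with r odd
        r //= 2
    while True:
        w = rng.randrange(1, n)
        x = gcd(w, n)
        if 1 < x < n:                # lucky: w shares a factor with n
            return x, n // x
        v = pow(w, r, n)
        if v == 1:
            continue                 # 'failure' for this w, try another
        while v != 1:                # square until we reach 1
            v0 = v
            v = (v * v) % n
        if v0 != n - 1:              # v0 is a non-trivial square root of 1
            x = gcd(v0 + 1, n)
            return x, n // x


print(sorted(factor_with_exponents(89855713, 82330933, 34986517)))
```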

9.9.2 Wiener’s low decryption exponent attack 

This attack works in the case when

$3a < n^{1/4}$ and $q < p < 2q$.

This means, if n has l bits in binary, then a has fewer than l/4−1 bits and p and q are not too far apart.

Notice that Bob might be tempted to choose a small decryption exponent in order to speed up decryption. 

If he chooses a as above, then he saves 75% of the time needed. We prove next that such choices should be 

avoided. 

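Since $ab \equiv 1 \pmod{\varphi(n)}$, there is t such that

$$ab - t\varphi(n) = 1.$$

We have then

$$0 < n - \varphi(n) = p + q - 1 < 2q + q - 1 < 3q < 3\sqrt{n}$$

$$\left|\frac{b}{n} - \frac{t}{a}\right| = \left|\frac{ba - tn}{an}\right| = \left|\frac{1 + t(\varphi(n) - n)}{an}\right| < \frac{3t\sqrt{n}}{an} = \frac{3t}{a\sqrt{n}}.$$

Since t < a (because b < φ(n)), we have $3t < 3a < n^{1/4}$ and so

$$\left|\frac{b}{n} - \frac{t}{a}\right| < \frac{1}{a n^{1/4}} < \frac{1}{3a^2}.$$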

Therefore, the fraction t/a is a very close approximation of b/n. We use now the theory of continued fractions 

and deduce that t/a must be one of the convergents in the continued fraction expansion of b/n (see below). 

A (finite) continued fraction is a tuple $[q_1, q_2, \ldots, q_m]$ of non-negative integers which is a shorthand for

$$q_1 + \cfrac{1}{q_2 + \cfrac{1}{q_3 + \cdots + \cfrac{1}{q_m}}}$$

It is not difficult to see that if gcd(a,b) = 1, then a/b can be written as a continued fraction using the quotients 

in the Euclidean algorithm. We shall give only an example. 

Example 9.16. Consider the fraction 34/99. In the Euclidean algorithm we have 

34 = 0×99+34 

99 = 2×34+31 

34 = 1×31+3 

31 = 10×3+1 

3 = 3×1.


The continued fraction expansion will be [0,2,1,10,3], i.e.,

$$\frac{34}{99} = 0 + \cfrac{1}{2 + \cfrac{1}{1 + \cfrac{1}{10 + \cfrac{1}{3}}}}$$

For the continued fraction [q 1 ,...,q m ], the continued fractions [q 1 ,...,q j ], 1 ≤ j ≤ m, are called its convergents. 

Example 9.17. The convergents of the continued fraction in the example above are 

[0] = 0 

[0,2] = 1/2 

[0,2,1] = 1/3 

[0,2,1,10] = 11/32 

[0,2,1,10,3] = 34/99. 

For our attack we shall use the following result from the theory of continued fractions. 

Lemma 9.18. If gcd(a,b) = gcd(c,d) = 1 and

$$\left|\frac{a}{b} - \frac{c}{d}\right| < \frac{1}{2d^2},$$

then c/d is one of the convergents of the continued fraction expansion of a/b.

This lemma gives us that the unknown fraction t/a must be one of the convergents of the continued fraction 

expansion of b/n; notice that b/n is publicly known. All we need to do is to test each convergent to see if it is 

the right one. 

Wiener’s algorithm 

- given: n = pq product of two odd (unknown) primes, and the public encryption exponent b

- computes: p and q if the conditions for Wiener’s algorithm are satisfied

Algorithm:

1. $(q_1, q_2, \ldots, q_m) \leftarrow$ EuclideanAlg(b,n)

2. $c_0 \leftarrow 1$

3. $c_1 \leftarrow q_1$

4. $d_0 \leftarrow 0$

5. $d_1 \leftarrow 1$

6. $j \leftarrow 1$

7. while j ≤ m do

8. $n' \leftarrow (d_j b - 1)/c_j$ [$n' = \varphi(n)$ if $c_j/d_j$ is the right convergent; this test is skipped when $c_j = 0$]

9. if $n'$ is an integer then

10. let p and q be the roots of the equation

11. $x^2 - (n - n' + 1)x + n = 0$

12. if p and q are positive integers less than n then

13. return (p,q)

14. $j \leftarrow j + 1$

15. $c_j \leftarrow q_j c_{j-1} + c_{j-2}$

16. $d_j \leftarrow q_j d_{j-1} + d_{j-2}$

17. return (“failure”)
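Wiener's algorithm can be sketched as follows. The function names are assumptions, and the organization differs slightly from the numbered steps: the convergents of b/n are produced by a generator, and each candidate t/a is tested by checking whether the resulting quadratic has integer roots whose product is n.

```python
from math import isqrt


def continued_fraction(num, den):
    """Quotients of the Euclidean algorithm applied to num/den."""
    quotients = []
    while den:
        q, r = divmod(num, den)
        quotients.append(q)
        num, den = den, r
    return quotients


def convergents(quotients):
    """Yield the convergents c_j/d_j as pairs (c_j, d_j)."""
    c_prev, c = 1, quotients[0]
    d_prev, d = 0, 1
    yield c, d
    for q in quotients[1:]:
        c_prev, c = c, q * c + c_prev
        d_prev, d = d, q * d + d_prev
        yield c, d


def wiener_attack(n, b):
    """Return (p, q) with p <= q if some convergent t/a of b/n reveals phi(n)."""
    for t, a in convergents(continued_fraction(b, n)):
        if t == 0 or (a * b - 1) % t != 0:
            continue
        phi = (a * b - 1) // t           # candidate for phi(n)
        s = n - phi + 1                  # candidate for p + q
        disc = s * s - 4 * n
        if disc < 0:
            continue
        r = isqrt(disc)
        if r * r != disc or (s + r) % 2 != 0:
            continue
        p, q = (s - r) // 2, (s + r) // 2
        if p > 1 and p * q == n:
            return p, q
    return None


print(continued_fraction(34, 99))
print(wiener_attack(160523347, 60728973))
```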

□ 

□


Example 9.19. Suppose n = 160523347 and b = 60728973. The continued fraction expansion of b/n is

[0,2,1,1,1,4,12,102,1,1,2,3,2,2,36].

The first few convergents are

0, 1/2, 1/3, 2/5, 3/8, 14/37.

It can be verified that the convergent which produces a factorization is 14/37, which yields

$$n' = \frac{37 \times 60728973 - 1}{14} = 160498000.$$

If we now solve the equation

$x^2 - 25348x + 160523347 = 0$,

then we find the roots 12347 and 13001. We have then the factorization

n = 12347×13001.

Notice that for the modulus n = 160523347, Wiener’s algorithm will work for

$a < \frac{1}{3} n^{1/4} \approx 37.52$.

9.9.3 Partial information about plaintext bits

So far we considered total break of the system. We consider here a more modest goal the adversary might 

have. He might want to find out only some partial information about the plaintext x revealed by the ciphertext 

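$y = e_K(x)$. One example of such information is the Jacobi symbol

$$\left(\frac{y}{n}\right) = \left(\frac{x}{n}\right)^b = \left(\frac{x}{n}\right)$$

which can be computed without knowing x. We consider in this subsection some other types of information about the plaintext, such as:

- the low order bit of plaintext: $\mathrm{parity}(y) = \begin{cases} 0, & \text{if } x \text{ is even} \\ 1, & \text{if } x \text{ is odd} \end{cases}$

- in which half of $\mathbb{Z}_n$ is x: $\mathrm{half}(y) = \begin{cases} 0, & \text{if } 0 \le x < n/2 \\ 1, & \text{if } n/2 < x \le n-1 \end{cases}$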

We shall prove in this section that computing parity or half is polynomially equivalent with determining the 

plaintext. 

First we notice that parity and half are polynomially equivalent. This holds because 

- half(y) = parity(y ×e K (2) mod n) 

- parity(y) = half(y ×e K (2 −1 ) mod n) 

Next we give an algorithm which computes the plaintext in polynomial time, given an oracle for half. 

□


RSA decryption algorithm using an oracle for half 

- given: a cipher text y = e K (x) 

- computes: x using half 

Algorithm: 

1. $k \leftarrow \lfloor \log_2 n \rfloor$

2. for i from 0 to k do

3. $h_i \leftarrow \mathrm{half}(y)$

4. $y \leftarrow (y \times e_K(2)) \bmod n$

5. lo ← 0

6. hi ← n

7. for i from 0 to k do

8. mid ← (hi+lo)/2

9. if $h_i = 1$ then lo ← mid

10. else hi ← mid

11. return (⌊hi⌋)

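We notice that the RSA encryption function satisfies the following multiplicative property

$$e_K(x_1 x_2) = e_K(x_1) e_K(x_2).$$

Therefore, in the ith iteration of the first loop, we have

$$h_i = \mathrm{half}(y \times (e_K(2))^i) = \mathrm{half}(e_K(x \times 2^i)).$$

We observe that

$$\mathrm{half}(e_K(x)) = 0 \text{ iff } x \in \left[0, \tfrac{n}{2}\right)$$

$$\mathrm{half}(e_K(2x)) = 0 \text{ iff } x \in \left[0, \tfrac{n}{4}\right) \cup \left[\tfrac{n}{2}, \tfrac{3n}{4}\right)$$

$$\mathrm{half}(e_K(4x)) = 0 \text{ iff } x \in \left[0, \tfrac{n}{8}\right) \cup \left[\tfrac{n}{4}, \tfrac{3n}{8}\right) \cup \left[\tfrac{n}{2}, \tfrac{5n}{8}\right) \cup \left[\tfrac{3n}{4}, \tfrac{7n}{8}\right)$$

and so on. Hence we find x by a binary search technique.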

Example 9.20. Assume n = 1457, b = 779, and y = 722. The search proceeds as below; the plaintext is 

x = ⌊999.55⌋= 999. 

i h i lo mid hi 

0 1 0.00 728.50 1457.00 

1 0 728.50 1092.75 1457.00 

2 1 728.50 910.62 1092.75 

3 0 910.62 1001.69 1092.75 

4 1 910.62 956.16 1001.69 

5 1 956.16 978.92 1001.69 

6 1 978.92 990.30 1001.69 

7 1 990.30 996.00 1001.69 

8 1 996.00 998.84 1001.69 

9 0 998.84 1000.26 1001.69 

10 0 998.84 999.55 1000.26 

998.84 999.55 999.55 

□
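The binary search of Example 9.20 can be reproduced with the sketch below. No real oracle for half exists, so the code simulates one from the private key; the names `make_half_oracle` and `decrypt_with_half`, and the factorization 1457 = 31 × 47 used to build the simulated oracle, are assumptions made for this illustration. Exact fractions are used instead of the rounded decimals of the table.

```python
from fractions import Fraction
from math import floor


def make_half_oracle(n, a):
    """Simulated oracle: decrypts with the private exponent a, reveals one bit."""
    def half(y):
        x = pow(y, a, n)
        return 1 if 2 * x > n else 0
    return half


def decrypt_with_half(y, n, b, half):
    """Recover the plaintext from y = x^b mod n using only the half oracle."""
    k = n.bit_length() - 1               # floor(log2 n)
    enc2 = pow(2, b, n)                  # e_K(2)
    bits = []
    for _ in range(k + 1):
        bits.append(half(y))
        y = (y * enc2) % n               # now y = e_K(2^i * x)
    lo, hi = Fraction(0), Fraction(n)
    for h in bits:
        mid = (lo + hi) / 2
        if h == 1:
            lo = mid
        else:
            hi = mid
    return floor(hi)


n, b = 1457, 779                         # n = 31 * 47 (assumed factorization)
a = pow(b, -1, 30 * 46)                  # private exponent for the simulation
print(decrypt_with_half(722, n, b, make_half_oracle(n, a)))
```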


10 FACTORING ALGORITHMS 

- special purpose algorithms: running time depends on some properties of the number n to be factored 

- general purpose algorithms: running time depends on n only 

10.1 Trial division 

If n is composite, then it has a non-trivial factor which is at most √n. Trial division tries 2 and all odd integers up to √n.

In the worst case, O( √ n) divisions are performed. 

10.2 Pollard’s p−1 algorithm 

- for n having a prime factor p such that p−1 has only small prime factors

Pollard’s p − 1 algorithm for factoring integers 

- given: n and B two integers 

- computes: a non-trivial factor of n 

Algorithm: 

1. a = 2

2. for j from 2 to B do

3. $a \leftarrow a^j \bmod n$

4. d ← gcd(a−1,n)

5. if 1 < d < n then return ‘success: d’

6. else return ‘failure’

Complexity: O(B) modular exponentiations each requiring O(log B) modular multiplications (square and multiply) plus the gcd: altogether $O(B \log B (\log n)^2 + (\log n)^3)$

- for B large, this can be √n

- idea: assume p is a prime divisor of n such that q ≤ B for every prime power q which divides p−1

- then $(p-1) \mid B!$

- before step 4 (at the end of for in steps 2 and 3), we have $a \equiv 2^{B!} \pmod n$ and therefore $a \equiv 2^{B!} \pmod p$

- by Fermat’s theorem, $2^{p-1} \equiv 1 \pmod p$

- hence $a \equiv 1 \pmod p$

- thus $p \mid (a-1)$ and so $p \mid d = \gcd(a-1,n)$, which implies that d is a non-trivial divisor of n (unless d = n)

Example 10.1. Assume n = 15770708441 and use B = 180 

- we find in step 3 that a = 11620221425 has gcd(a−1,n) = 135979=d 

- n = 135979×115979 

- the success is due to the fact that 135978 has only small prime factors: 

135978= 2×3×131×173 

- therefore, any B ≥ 173 is good □ 
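The p − 1 algorithm is short enough to be written out directly. This is a minimal sketch following the six steps above (the function name is an assumption), applied to the number of Example 10.1.

```python
from math import gcd


def pollard_p_minus_1(n, B):
    """Return a non-trivial factor of n, or None on failure."""
    a = 2
    for j in range(2, B + 1):
        a = pow(a, j, n)             # after the loop, a = 2^(B!) mod n
    d = gcd(a - 1, n)
    return d if 1 < d < n else None


print(pollard_p_minus_1(15770708441, 180))
```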

- primes for RSA 

- we have to choose n = pq, p,q primes such that p−1 and q −1 do not have only small factors 

- we can choose p and q such that p = 2p 1 +1, q = 2q 1 +1 with p 1 and q 1 primes also


10.3 Pollard’s rho algorithm 

- idea: compute $x_1 = 2$, $x_2 = x_1^2 + 1 \bmod n$, $x_3 = x_2^2 + 1 \bmod n$, ...

- if $1 < \gcd(x_i - x_j, n) < n$, then we found a divisor of n

- that is: we want to find two $x_i$’s which are in different residue classes modulo n but in the same residue class modulo a divisor of n

- improvement: we need not compute all $\gcd(x_i - x_j, n)$;

- if $x_i \equiv x_j \pmod r$, for some $r \mid n$, then also $x_{i+k} \equiv x_{j+k} \pmod r$

Pollard’s rho algorithm for factoring integers

- given: n an integer

- computes: a non-trivial factor of n

Algorithm:

1. a = 2, b = 2

2. for i = 1,2,3,... do

3. compute $a = a^2 + 1 \bmod n$, $b = b^2 + 1 \bmod n$, $b = b^2 + 1 \bmod n$

4. compute d = gcd(a−b,n)

5. if 1 < d < n then return ‘success: d’

6. if d = n then return ‘failure’

Complexity: assuming $x^2 + 1$ behaves like a random function, the expected running time is $O(n^{1/4})$ modular multiplications

multiplications 

Example 10.2. Assume n = 455459; we have the values of a and b: 

a b d 

5 26 1 

26 2871 1 

677 179685 1 

2871 155260 1 

44380 416250 1 

179685 43670 1 

121634 164403 1 

155260 247944 1 

44567 68343 743 

- finally 455459=743×613 □ 
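The rho algorithm above can be sketched in a few lines; the function name and the optional `start` parameter are assumptions. The variable a advances one step per iteration and b two steps, so that a − b compares $x_{i+1}$ with $x_{2i+1}$.

```python
from math import gcd


def pollard_rho(n, start=2):
    """Return a non-trivial factor of n, or None on failure."""
    a = b = start
    while True:
        a = (a * a + 1) % n              # one step
        b = (b * b + 1) % n              # two steps
        b = (b * b + 1) % n
        d = gcd(a - b, n)
        if 1 < d < n:
            return d
        if d == n:
            return None                  # the sequences met modulo n itself


print(pollard_rho(455459))
```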

The name of the algorithm comes from the fact that, if we consider the sequence $x_1 \bmod p, x_2 \bmod p, \ldots$, then at some point a value will be repeated, producing a graph whose shape resembles the letter ρ. For the above example (with p = 743, starting from $x_2 = 5$) we have:

5 → 26 → 677 → 642 → 543 → 622 → 525 → 716 → 730 → 170
                           ↑                         ↓
                          200 ←−−− 399 ←−−− 576 ←−−− 667

10.4 Random square factoring

- idea: find x and y such that x 2 ≡ y 2 (mod n) but x ≢ ±y (mod n); then n | (x−y)(x+y) but n does not 

divide either of x−y and x+y; therefore gcd(x−y,n) is a non-trivial factor of n 

Dixon’s algorithm 

- given: n an integer

- computes: a non-trivial factor of n

Algorithm:

1. choose a factor base $B = \{p_1, p_2, \ldots, p_t\}$ (the first t primes)

2. find t+1 pairs $(a_i, b_i)$, 1 ≤ i ≤ t+1 (by random testing) such that

(i) $a_i^2 \equiv b_i \pmod n$

(ii) $b_i$ is $p_t$-smooth (that is, $b_i = \prod_{j=1}^{t} p_j^{e_{ij}}$)

3. find a subset of the $b_i$’s whose product is a perfect square

- we need only the parity of exponents (we have factorizations of $b_i$’s)

- associate $v_i = (v_{i1}, \ldots, v_{it})$ with $(e_{i1}, \ldots, e_{it})$ where $v_{ij} = e_{ij} \bmod 2$

- $v_1, \ldots, v_{t+1}$ must be linearly dependent over $(\mathbb{Z}_2)^t$; say $\sum_{i \in T} v_i = 0$

- then $\prod_{i \in T} b_i$ is a perfect square

- put $x = \prod_{i \in T} a_i$, y = the square root of $\prod_{i \in T} b_i$; then $x^2 \equiv y^2 \pmod n$

4. if $x \not\equiv \pm y \pmod n$ then return ‘success: gcd(x−y,n)’

5. else find other dependences and try again

- in practice, there will be several dependences

- also we can find more than t+1 pairs, to be sure we have more dependences

Example 10.3. Assume n = 15770708441 and choose B = {2,3,5,7,11,13}. Consider the congruences below with the corresponding vectors:

$8340934156^2 \equiv 3 \times 7 \pmod n$          (0,1,0,1,0,0)

$12044942944^2 \equiv 2 \times 7 \times 13 \pmod n$     (1,0,0,1,0,1)

$2773700011^2 \equiv 2 \times 3 \times 13 \pmod n$      (1,1,0,0,0,1)

The sum of the three vectors is easily seen to be congruent to (0,0,0,0,0,0) modulo 2. Therefore, the product of the three congruences will give:

$(8340934156 \times 12044942944 \times 2773700011)^2 \equiv (2 \times 3 \times 7 \times 13)^2 \pmod n$,

that is

$9503435785^2 \equiv 546^2 \pmod n$.

We compute then

gcd(9503435785−546, 15770708441) = 115979

which is a factor of n = 135979×115979.

□ 
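Steps 3 and 4 of Dixon's algorithm can be illustrated as follows. This is a sketch under simplifying assumptions: the candidate values $a_i$ are supplied by the caller rather than found by random testing, and the dependence is found by trying all subsets instead of linear algebra over $\mathbb{Z}_2$, which is only feasible for a handful of relations. The function names and the small test number 84923 = 163 × 521 are not from the notes.

```python
from itertools import combinations
from math import gcd, prod


def exponent_vector(b, base):
    """Exponents of b over the factor base, or None if b is not smooth."""
    if b == 0:
        return None
    exps = []
    for p in base:
        e = 0
        while b % p == 0:
            b //= p
            e += 1
        exps.append(e)
    return exps if b == 1 else None


def dixon_combine(n, a_values, base):
    """Combine smooth relations a^2 = b (mod n) into a factor of n, if possible."""
    relations = []
    for a in a_values:
        vec = exponent_vector((a * a) % n, base)
        if vec is not None:
            relations.append((a, vec))
    for size in range(1, len(relations) + 1):
        for subset in combinations(relations, size):
            totals = [sum(col) for col in zip(*(vec for _, vec in subset))]
            if any(e % 2 for e in totals):
                continue                      # product is not a perfect square
            x = prod(a for a, _ in subset) % n
            y = prod(pow(p, e // 2, n) for p, e in zip(base, totals)) % n
            d = gcd(x - y, n)
            if 1 < d < n:
                return d
    return None


# 513^2 = 8400 and 537^2 = 33600 (mod 84923), both smooth over {2, 3, 5, 7}
print(dixon_combine(84923, [513, 537], [2, 3, 5, 7]))
```

With the three values and the factor base of Example 10.3, the same function should reproduce the computation shown there, assuming the congruences are as printed.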

10.5 Quadratic sieve algorithm 

- idea: to obtain $a_i$’s such that the $b_i$’s are small; when the $b_i$’s are small, it is more likely that they are $p_t$-smooth

- let $m = \lfloor \sqrt{n} \rfloor$

- test $a_i$ of the form $a_i = m + x$ with $b_i = (x+m)^2 - n$

- notice that $a_i^2 \equiv b_i \pmod n$

- also, when x is small, $(x+m)^2 - n = x^2 + 2mx + m^2 - n \approx x^2 + 2mx$, which is also small

- trade-off: when t is large, we have better chances to have $p_t$-smooth integers but we need to accumulate more congruences to obtain a dependence relation

- optimal choice for t is approximately $\sqrt{e^{\sqrt{\ln n \ln\ln n}}}$

- for this we get the expected running time $O\left(e^{(1+o(1))\sqrt{\ln n \ln\ln n}}\right)$


10.6 The best current factoring algorithms 

quadratic sieve: $O\left(e^{(1+o(1))\sqrt{\ln n \ln\ln n}}\right)$

elliptic curve: $O\left(e^{(1+o(1))\sqrt{2 \ln p \ln\ln p}}\right)$

number field sieve: $O\left(e^{(1.92+o(1))(\ln n)^{1/3}(\ln\ln n)^{2/3}}\right)$

- o(1) approaches 0 as n goes to infinity and p is the smallest prime factor of n 

- in the worst case, p ≈ √ n, and so asymptotically the quadratic sieve and elliptic curve do the same 

- in general quadratic sieve outperforms elliptic curve 

- elliptic curve is better for prime factors of different size 

- number field sieve has the best asymptotic running time

- but (it seems) it is better only for numbers of 130 decimal digits or more

10.7 Factoring RSA moduli 

Here is a list of numbers which have been factored or for which prizes are offered:

number     digits   prize       factored
RSA-100    100                  Apr. 1991
RSA-110    110                  Apr. 1992
RSA-120    120                  Jun. 1993
RSA-129    129      $100        Apr. 1994
RSA-130    130                  Apr. 10, 1996
RSA-140    140                  Feb. 2, 1999
RSA-150    150      withdrawn   open
RSA-155    155                  Aug. 22, 1999
RSA-160    160                  Apr. 1, 2003
RSA-576    174      $10,000     Dec. 3, 2003
RSA-640    193      $20,000     Nov. 2, 2005
RSA-704    212      $30,000     open
RSA-768    232      $50,000     Dec. 12, 2009
RSA-896    270      $75,000     open
RSA-1024   309      $100,000    open
RSA-1536   463      $150,000    open
RSA-2048   617      $200,000    open

The two 87-digit factors of RSA-576 are: 

3980750 8642406493 7397125500 5503864911 9906436234 2526708406 3851895759 4638895726 1768583317 

4727721 4610743530 2536223071 9730482246 3291469530 2097116459 8521711305 2071125636 3590397527


11 OTHER PUBLIC-KEY CRYPTOSYSTEMS 

We present in this section two other public-key ciphers: Rabin and ElGamal. 

11.1 Rabin cryptosystem 

The Rabin cryptosystem provides an example of a provably secure cryptosystem. Breaking the system is 

provably as difficult as factoring the modulus. 

The Rabin Cryptosystem 

P = C = Z ∗ n; n = pq, p,q primes, p ≡ 3 (mod 4), q ≡ 3 (mod 4) 

K = {(n,p,q) | n = pq}. 

public: n 

private: p,q 

encryption: e K (x) = x 2 mod n 

decryption: d K (y) = √ y mod n 

Note: the requirements p ≡ 3 (mod 4), q ≡ 3 (mod 4), and P = C = Z ∗ n can be omitted. They simplify the 

analysis. 

One drawback of the Rabin cryptosystem is that the encryption function is not an injection and so decryption

cannot be done in an unambiguous fashion. Assume y is a valid ciphertext. The ambiguity comes from the fact 

that there are four square roots of y modulo n (see below). In general, Bob has no way to see which one of 

these is the correct plaintext unless it contains sufficient redundancy to eliminate the three wrong possibilities. 

Bob has to solve the equation 

x 2 ≡ y (mod n). 

This is equivalent to solving the two congruences 

z 2 ≡ y (mod p) and z 2 ≡ y (mod q). 

We can use Euler’s criterion to determine if y is a quadratic residue modulo p (and modulo q). If the encryption 

was done correctly, it will be. Euler’s criterion does not help finding the roots. The special form of p and q 

makes this simple. We have 

(±y (p+1)/4 ) 2 ≡ y (p+1)/2 (mod p) 

≡ y (p−1)/2 y (mod p) 

≡ y (mod p) 

The two square roots of y modulo p are $\pm y^{(p+1)/4} \bmod p$. Similarly, the ones modulo q are $\pm y^{(q+1)/4} \bmod q$.

The four square roots of y modulo n are obtained using the Chinese remainder theorem. 

Example 11.1. Assume n = 77 = 7×11. The encryption function is

$e_K(x) = x^2 \bmod 77$

and the decryption function is

$d_K(y) = \sqrt{y} \bmod 77.$

Suppose Bob has to decrypt y = 23. We have first

$23^{(7+1)/4} \equiv 2^2 \equiv 4 \pmod 7$

$23^{(11+1)/4} \equiv 1^3 \equiv 1 \pmod{11}.$

Using the Chinese remainder theorem, we compute the four square roots of 23 modulo 77 to be ±10, ±32 mod 77.

The four possible plaintexts are x = 10, 32, 45, 67.

□
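As an illustration of the decryption procedure, here is a minimal Python sketch that computes the four square roots using the special form of p and q and the Chinese remainder theorem. The function name `rabin_roots` is hypothetical, and the three-argument `pow` with exponent −1 assumes Python 3.8 or later.

```python
def rabin_roots(y, p, q):
    """Return the four square roots of y modulo n = p*q, for p = q = 3 (mod 4)."""
    n = p * q
    rp = pow(y, (p + 1) // 4, p)       # a square root of y modulo p
    rq = pow(y, (q + 1) // 4, q)       # a square root of y modulo q
    # CRT coefficients: cp = 1 (mod p), 0 (mod q); cq = 0 (mod p), 1 (mod q)
    cp = q * pow(q, -1, p)
    cq = p * pow(p, -1, q)
    roots = set()
    for sp in (rp, -rp):               # both signs modulo p
        for sq in (rq, -rq):           # both signs modulo q
            roots.add((sp * cp + sq * cq) % n)
    return sorted(roots)

print(rabin_roots(23, 7, 11))          # the candidates from Example 11.1
```

Bob would still need redundancy in the plaintext to decide which of the four candidates is the intended message.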


11.2 Security of Rabin cryptosystem 

We shall prove that a decryption oracle Rabin-Decrypt can be incorporated into a Las Vegas algorithm that 

factors the modulus n with probability at least 1/2. That means that any algorithm able to decrypt can be 

used to factor the modulus or, put otherwise, decrypting is no easier than factoring. 

Factoring a Rabin modulus, given a decryption oracle 

- given: n = pq, p,q primes congruent to 3 modulo 4 

- computes: p or q using Rabin-Decrypt 

Algorithm:

1. choose a random $r \in Z_n^*$

2. $y \leftarrow r^2 \bmod n$

3. $x \leftarrow$ Rabin-Decrypt(y)

4. if $x \equiv \pm r \pmod n$ then

5. return ‘failure’

6. else

7. $p \leftarrow \gcd(x+r, n)$

8. $q \leftarrow n/p$

9. return ‘success: n = p×q’

Notice that y is a valid ciphertext and so Rabin-Decrypt will return one out of four possible plaintexts. 

Those are in fact ±r (mod n) and ±ωr (mod n), where ω is one of the nontrivial square roots of 1 modulo n. 

For the latter ones we have x 2 ≡ r 2 (mod n) but x ≢ ±r (mod n) and we can factor n. 

It is clear that the probability of success is 1/2. 
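The reduction can be tried out with a simulated oracle. The sketch below is an illustration under assumptions: `make_rabin_decrypt` is a hypothetical stand-in for Rabin-Decrypt that returns a uniformly chosen square root, and `factor_with_oracle` simply repeats the Las Vegas algorithm until it succeeds or a retry limit is reached.

```python
import random
from math import gcd

def make_rabin_decrypt(p, q):
    """Build a stand-in oracle returning one of the four square roots of y mod p*q."""
    n = p * q
    cp, cq = q * pow(q, -1, p), p * pow(p, -1, q)   # CRT coefficients
    def decrypt(y):
        rp = pow(y, (p + 1) // 4, p)
        rq = pow(y, (q + 1) // 4, q)
        sp = random.choice((rp, -rp))               # random sign modulo p
        sq = random.choice((rq, -rq))               # random sign modulo q
        return (sp * cp + sq * cq) % n
    return decrypt

def factor_with_oracle(n, rabin_decrypt, max_tries=128):
    """Repeat the Las Vegas algorithm; each valid attempt succeeds with probability 1/2."""
    for _ in range(max_tries):
        r = random.randrange(1, n)
        if gcd(r, n) != 1:
            continue                   # r is not in Z_n^*, pick another one
        y = r * r % n
        x = rabin_decrypt(y)
        if x in (r, n - r):
            continue                   # 'failure' for this attempt, retry
        p = gcd(x + r, n)              # x = +-omega*r, so gcd is a proper factor
        return p, n // p
    return None
```

With the oracle built from p = 7 and q = 11, a call such as `factor_with_oracle(77, make_rabin_decrypt(7, 11))` recovers the two primes after very few attempts on average.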

We need to clarify a very important point. We just proved the Rabin cryptosystem secure against ciphertext-only

or chosen-plaintext attacks. However, it is completely insecure against a chosen-ciphertext attack. This is simply

because the above algorithm works very well with the decryption algorithm instead of the Rabin-Decrypt

oracle. (The security proof says that a decryption oracle can be used to factor n and a chosen ciphertext attack 

assumes that a decryption oracle exists!) This problem can be avoided by adding redundancy to the plaintext; 

e.g., last 64 bits are repeated. 

11.3 ElGamal cryptosystem 

The ElGamal cryptosystem is based on the Discrete Logarithm problem, which is believed to be difficult. The

trapdoor one-way function is modular exponentiation. 

Discrete Logarithm Problem (discretelog) 

- given: p a prime, α ∈ Z ∗ p primitive, β ∈ Z∗ p 

- compute: log α β = a,0 ≤ a ≤ p−2 such that α a ≡ β (mod p) 

ElGamal Cryptosystem 

P = Z ∗ p; C = Z ∗ p ×Z ∗ p; p prime, α ∈ Z ∗ p primitive 

K = {(p,α,a,β) | β ≡ α a (mod p)}. 

public: p,α,β 

private: a 

encryption: e K (x,k) = (y 1 ,y 2 ) = (α k mod p,xβ k mod p) 

- k ∈ Z p−1 is a secret random number 

decryption: d K (y 1 ,y 2 ) = y 2 (y a 1 )−1 mod p


Notice that the encryption operation is randomized since the ciphertext depends on both the plaintext x and 

on a random value k chosen by Alice. There will be many ciphertexts (precisely p−1) which are encryptions of 

the same plaintext. The plaintext x is said to be masked by β k . Bob can compute β k ≡ (α a ) k ≡ (α k ) a mod p 

because he knows a. Then he removes the mask dividing y 2 by β k and obtains x. 

Example 11.2. Assume p = 2579, α = 2, and a = 765. Then

$\beta = 2^{765} \bmod 2579 = 949.$

Suppose Alice encrypts the message x = 1299 with the random k = 853. She computes

$y_1 = 2^{853} \bmod 2579 = 435$

$y_2 = 1299 \times 949^{853} \bmod 2579 = 2396.$

Bob receives the ciphertext (435, 2396) and computes

$x = 2396 \times (435^{765})^{-1} \bmod 2579 = 1299.$

□

Conjecture 11.3. Security of the ElGamal cryptosystem is equivalent to the discretelog problem.

Note: one way is obvious.
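The encryption and decryption rules of the ElGamal cryptosystem can be checked with a few lines of Python. This is a minimal sketch; the function names are hypothetical, and the modular inverse via `pow(mask, -1, p)` assumes Python 3.8 or later.

```python
def elgamal_encrypt(p, alpha, beta, x, k):
    """Return (y1, y2) = (alpha^k mod p, x * beta^k mod p) for a secret random k."""
    y1 = pow(alpha, k, p)
    y2 = x * pow(beta, k, p) % p       # the plaintext masked by beta^k
    return y1, y2

def elgamal_decrypt(p, a, y1, y2):
    """Remove the mask: x = y2 * (y1^a)^(-1) mod p."""
    mask = pow(y1, a, p)               # equals beta^k because beta = alpha^a
    return y2 * pow(mask, -1, p) % p

# Parameters of Example 11.2
p, alpha, a = 2579, 2, 765
beta = pow(alpha, a, p)
y1, y2 = elgamal_encrypt(p, alpha, beta, 1299, 853)
print(beta, (y1, y2), elgamal_decrypt(p, a, y1, y2))
```

In a real use k would be drawn fresh at random for every message; it is fixed here only to reproduce the example.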


12 ALGORITHMS FOR DISCRETE LOGARITHM 

- exhaustive search 

- compute α 0 ,α 1 ,α 2 ,... until β is found 

- O(p) multiplications – inefficient for p large 

12.1 Shanks’ baby-step giant-step algorithm

- idea: if $m = \lceil\sqrt{p-1}\rceil$ and $a = jm+i$, then $\alpha^a = \alpha^{jm}\alpha^i$, which implies $\beta\alpha^{-i} = \alpha^{mj}$

Shanks’ algorithm for discretelog problem

- given: p a prime, α ∈ Z ∗ p primitive, β ∈ Z ∗ p 

- computes: log α β 

Algorithm: 

1. put m = ⌈ √ p−1⌉ 

2. compute α mj mod p, 0 ≤ j ≤ m−1 (giant steps) 

3. sort the pairs (j,α mj mod p) by the second component in a list L 1 

4. compute βα −i mod p, 0 ≤ i ≤ m−1 (baby steps) 

5. sort the pairs (i,βα −i mod p) by the second component in a list L 2 

6. find two pairs, (j,y) ∈ L 1 and (i,y) ∈ L 2 (same second component) 

7. return log α β = mj +i mod (p−1) 

Complexity – O( √ p) multiplications 
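A compact Python version of the algorithm is sketched below. It departs from the listing in one respect, stated here as an assumption of the sketch: instead of sorting two lists and merging them, it stores the giant steps in a dictionary and looks up each baby step, which finds the same matching pair. The name `shanks_log` is hypothetical.

```python
from math import isqrt

def shanks_log(p, alpha, beta):
    """Return log_alpha(beta) in Z_p^*, assuming alpha is primitive modulo the prime p."""
    m = isqrt(p - 2) + 1               # m = ceil(sqrt(p - 1)) for p >= 3
    # giant steps: map alpha^(m*j) mod p to j, for 0 <= j <= m-1
    giant = {}
    step = pow(alpha, m, p)
    value = 1
    for j in range(m):
        giant.setdefault(value, j)
        value = value * step % p
    # baby steps: look up beta * alpha^(-i) mod p, for 0 <= i <= m-1
    alpha_inv = pow(alpha, -1, p)
    value = beta % p
    for i in range(m):
        if value in giant:
            return (m * giant[value] + i) % (p - 1)
        value = value * alpha_inv % p
    return None                        # not reached when alpha is primitive
```

For instance, with p = 37 and the primitive element α = 2, `shanks_log(37, 2, 32)` returns 5.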

12.2 Pohlig-Hellman algorithm 

- idea: use the factorization of the order of α: $p-1 = \prod_{i=1}^{k} p_i^{c_i}$

- we compute $a = \log_\alpha \beta \bmod (p-1)$

- it is enough to

- compute $a \bmod p_i^{c_i}$ for all $1 \le i \le k$ and

- then use Chinese Remainder Theorem to get $a \bmod (p-1)$

- computation of $x = a \bmod q^c$, where $q^c \mid p-1$ but $q^{c+1} \nmid p-1$

- write x in base q: $x = \sum_{i=0}^{c-1} a_i q^i$, $0 \le a_i \le q-1$ for all i

- put also $a = x + q^c s$, for some s

- compute $a_0$

- this is done using $\beta^{(p-1)/q} \equiv \alpha^{(p-1)a_0/q} \pmod p$

- why this:

- first $\beta^{(p-1)/q} \equiv \alpha^{(p-1)(x+q^c s)/q} \pmod p$

- it suffices to show $\frac{1}{q}(p-1)(x+q^c s) \equiv \frac{1}{q}(p-1)a_0 \pmod{p-1}$

- this is true because:

$\frac{1}{q}(p-1)(x+q^c s) - \frac{1}{q}(p-1)a_0 = \frac{1}{q}(p-1)(x+q^c s-a_0)$

$= \frac{1}{q}(p-1)\left(\sum_{i=1}^{c-1} a_i q^i + q^c s\right)$

$= (p-1)\left(\sum_{i=1}^{c-1} a_i q^{i-1} + q^{c-1} s\right)$

$\equiv 0 \pmod{p-1}$

- how is a 0 computed 

- compute first β (p−1)/q mod p 

- if this is 1, then a 0 = 0 

- if not, then compute γ = α (p−1)/q mod p, γ 2 mod p,... 

until γ i ≡ β (p−1)/q (mod p) 

- put then a 0 = i 

- if c = 1, we are done, if not we continue with computing a 1 

- compute a 1 – similarly 

- get rid of $a_0$: put $\beta_1 = \beta\alpha^{-a_0}$

- put also $x_1 = \log_\alpha \beta_1 \bmod q^c$

- we have $x_1 = \sum_{i=1}^{c-1} a_i q^i$

- then $\beta_1^{(p-1)/q^2} \equiv \alpha^{(p-1)a_1/q} \pmod p$

- compute $\beta_1^{(p-1)/q^2} \bmod p$

- find i such that $\gamma^i \equiv \beta_1^{(p-1)/q^2} \pmod p$

- this i will be $a_1$

- we repeat this for finding $a_2, a_3, \ldots$

Pohlig-Hellman algorithm 

- given: p prime, q prime, q c | p−1, q c+1 ∤ p−1, α primitive modulo p 

- computes: log α β mod q c 

Algorithm: 

1. compute γ i = α (p−1)i/q mod p, for 0 ≤ i ≤ q −1 

2. put β 0 = β 

3. for j = 0 to c−1 do 

4. compute $\delta = \beta_j^{(p-1)/q^{j+1}} \bmod p$

5. find i such that $\delta = \gamma_i$

6. $a_j = i$

7. $\beta_{j+1} = \beta_j \alpha^{-a_j q^j} \bmod p$

8. return a 0 ,a 1 ,...,a c−1 

- useful for p−1 having small prime factors only
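The whole procedure, including the final Chinese Remainder step, can be sketched in Python as follows. The sketch assumes α is primitive modulo p and that p−1 is small enough to factor by trial division; the function names are hypothetical.

```python
def log_mod_prime_power(p, alpha, beta, q, c):
    """Return log_alpha(beta) mod q^c, where q^c divides p-1 and alpha is primitive."""
    gamma = pow(alpha, (p - 1) // q, p)
    table = {pow(gamma, i, p): i for i in range(q)}   # gamma_i -> i
    alpha_inv = pow(alpha, -1, p)
    x, beta_j = 0, beta % p
    for j in range(c):
        delta = pow(beta_j, (p - 1) // q ** (j + 1), p)
        a_j = table[delta]                            # digit a_j of x in base q
        x += a_j * q ** j
        beta_j = beta_j * pow(alpha_inv, a_j * q ** j, p) % p
    return x

def factorize(n):
    """Trial-division factorization, returned as {prime: exponent}."""
    factors, d = {}, 2
    while d * d <= n:
        while n % d == 0:
            factors[d] = factors.get(d, 0) + 1
            n //= d
        d += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors

def pohlig_hellman(p, alpha, beta):
    """Combine the residues a mod q^c with the Chinese Remainder Theorem."""
    result, modulus = 0, 1
    for q, c in factorize(p - 1).items():
        qc = q ** c
        r = log_mod_prime_power(p, alpha, beta, q, c)
        # choose t so that result + modulus*t = r (mod qc)
        t = (r - result) * pow(modulus, -1, qc) % qc
        result += modulus * t
        modulus *= qc
    return result
```

For p = 37, where p−1 = 2^2·3^2 and α = 2 is primitive, `pohlig_hellman(37, 2, 32)` returns 5.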


13 HASH FUNCTIONS AND MESSAGE AUTHENTICATION 

13.1 Data integrity and hash functions 

One of the goals of cryptography is data integrity. A (cryptographic) hash function can provide assurance of

data integrity. A hash function is used to construct a short “fingerprint” of data; if the data is altered, then the 

fingerprint will no longer be valid. Even if the data is stored in an insecure place, its integrity can be checked 

by recomputing its fingerprint. We assume the fingerprint is stored in a secured place. 

If h is a hash function and x is some data, then the fingerprint is y = h(x) and is referred to as a message 

digest (or authentication tag). A message digest is usually a fairly short binary string; commonly 160 bits. A 

very important application of hash functions is in the context of digital signatures. 

It is also very useful to have keyed hash functions. They are used as message authentication codes or MACs. 

We assume Alice and Bob share a common secret key K which determines a hash function h K . For a message 

x, the fingerprint is y = h K (x) and can be computed by both Alice and Bob. Now both the message and the 

fingerprint (x,y) can be sent over an insecure channel from Alice to Bob. Bob will verify that y = h K (x). 

Of course, we need to assume that the hash functions, keyed or not, are “secure” in a sense to be made 

precise. 

A hash family is a 4-tuple (X,Y,K,H) where X is the set of messages, Y is the set of message digests, K is 

the set of keys, and for each K ∈ K, there is a hash function h K ∈ H, h K : X → Y. The set X can be finite 

or infinite but Y is always finite. If X is finite, then the hash function is called a compression function and we

shall assume |X| ≥ |Y|. A pair (x,y) is called a valid pair under the key K if h K (x) = y. The most important 

property of hash functions is that they have to prevent the constructions of certain valid pairs by the adversary. 

The set of functions from X to Y is denoted Y X . Clearly, if |X| = N and |Y| = M, then there are M N such 

functions; the family is then called an (N,M)-hash family. 

A simple example of a hash function is as follows. Divide the message into blocks of the same size and 

then xor all of them. A variant is to rotate the intermediate hash value before xor-ing with the next block; see 

Fig 11.8. 

It is easy to see that none of these is a good hash function. The adversary can simply choose any message 

and then append a last block to it such that it has any given message digest. 

13.2 Properties of hash functions 

Assume h : X → Y is an unkeyed hash function. We define several problems related to the security of hash 

functions. The idea is that a valid pair (x,y) should be possible to construct only by choosing first x and then 

computing y = h(x) and not otherwise. In particular, it should not be possible to construct new valid pairs 

using old ones. Consider for instance the hash function h : Z n ×Z n → Z n , given by h(x,y) = ax+by mod n,


for fixed a,b ∈ Z n . If the adversary has two valid pairs h(x 1 ,y 1 ) = z 1 and h(x 2 ,y 2 ) = z 2 , then he can compute 

further valid pairs as follows: 

h(rx 1 +sx 2 mod n,ry 1 +sy 2 mod n) = rz 1 +sz 2 mod n. 

Therefore, this hash function is not secure. 
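The forgery can be verified numerically. In the sketch below the modulus, the coefficients a, b and all inputs are arbitrary illustrative values (assumptions, not taken from the notes); the adversary's function `forge` uses only the two known valid pairs and never touches a or b.

```python
def h(a, b, n, x, y):
    """The insecure linear hash h(x, y) = a*x + b*y mod n."""
    return (a * x + b * y) % n

def forge(n, pair1, pair2, r, s):
    """Build a new valid pair from two known ones, without knowing a and b."""
    (x1, y1, z1), (x2, y2, z2) = pair1, pair2
    return ((r * x1 + s * x2) % n,
            (r * y1 + s * y2) % n,
            (r * z1 + s * z2) % n)

# Hypothetical parameters: a and b are known only to the hash owner
n, a, b = 101, 37, 58
pair1 = (3, 14, h(a, b, n, 3, 14))
pair2 = (15, 92, h(a, b, n, 15, 92))
x, y, z = forge(n, pair1, pair2, r=6, s=5)
print(h(a, b, n, x, y) == z)
```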

We give next some problems which have to be computationally infeasible for secure hash functions. 

Preimage: Given h : X → Y and y ∈ Y, find x ∈ X such that h(x) = y. 

If the Preimage problem is difficult to solve for a hash function h, then h is called preimage resistant or 

one-way. 

Second Preimage: Given h : X → Y and x ∈ X, find x ′ ∈ X such that x ′ ≠ x and h(x ′ ) = h(x). 

If the Second Preimage problem is difficult to solve for a hash function h, then h is called second preimage 

resistant (or sometimes weak collision resistant). 

Collision: Given h : X → Y, find x,x ′ ∈ X such that x ′ ≠ x and h(x ′ ) = h(x). 

If the Collision problem is difficult to solve for a hash function h, then h is called collision resistant (or 

sometimes strong collision resistant). 

13.3 Security of hash functions 

In order to analyze the complexity of algorithms for the three problems in the previous section, we shall consider 

the following so called random oracle model which provides a mathematical model of an “ideal” hash function. 

In this model a hash function h : X → Y is chosen randomly and we have only oracle access to h. That means 

we are not given an algorithm to compute values of h. The only way to do that is to question an oracle. 

We have therefore the following independence property: if h is randomly chosen and X 0 ⊆ X such that the 

values h(x) were determined (by querying an oracle for h) iff x ∈ X 0 , then Prob(h(x) = y) = 1/M for all 

x ∈ X −X 0 and all y ∈ Y. 

The algorithms below are randomized; i.e., they can make random choices during their execution. We shall 

call an (ε,q)-algorithm a Las Vegas algorithm with average-case success probability ε which can make at most q

queries to the oracle. 

FindPreimage(h,y,q) 

- given: h hash function, y message digest, q maximum number of oracle queries 

- computes: a preimage x or fail 

Algorithm: 

1. choose X 0 ⊆ X with |X 0 | = q 

2. for each x ∈ X 0 do 

3. if h(x) = y then return x 

4. return fail 

The average-case success probability for the algorithm FindPreimage is

$\epsilon = 1 - \left(1 - \frac{1}{M}\right)^q$

(which, for q small compared to M, is approximately q/M). To see this, let $X_0 = \{x_1,\ldots,x_q\}$ and let $E_i$ be the

event “$h(x_i) = y$.” From the independence property we have $\mathrm{Prob}(E_i) = 1/M$ and so

$\mathrm{Prob}(E_1 \vee \cdots \vee E_q) = 1 - \left(1 - \frac{1}{M}\right)^q.$


FindSecondPreimage(h,x,q) 

- given: h hash function, x message, q maximum number of oracle queries 

- computes: a second preimage x 0 or fail 

Algorithm: 

1. y ← h(x) 

2. choose X 0 ⊆ X −{x} with |X 0 | = q −1 

3. for each x 0 ∈ X 0 do 

4. if h(x 0 ) = y then return x 0

5. return fail

The average-case success probability for the algorithm FindSecondPreimage is

$\epsilon = 1 - \left(1 - \frac{1}{M}\right)^{q-1}.$

FindCollision(h,q) 

- given: h hash function, q maximum number of oracle queries 

- computes: a collision (x,x ′ ) or fail 

Algorithm: 

1. choose X 0 ⊆ X with |X 0 | = q 

2. for each x ∈ X 0 do 

3. y x ← h(x) 

4. if $y_x = y_{x'}$ for some $x \ne x'$ then

5. return (x,x ′ )

6. else return fail


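The average-case success probability for the algorithm FindCollision is

$\epsilon = 1 - \left(1-\frac{1}{M}\right)\left(1-\frac{2}{M}\right)\cdots\left(1-\frac{q-1}{M}\right).$

To see this, let $X_0 = \{x_1,\ldots,x_q\}$ and let $E_i$ be the event “$h(x_i) \notin \{h(x_1),\ldots,h(x_{i-1})\}$.” We have that

$\mathrm{Prob}(E_i \mid E_1 \wedge \cdots \wedge E_{i-1}) = \frac{M-i+1}{M}.$

Therefore,

$\mathrm{Prob}(E_1 \wedge \cdots \wedge E_q) = \left(\frac{M-1}{M}\right)\left(\frac{M-2}{M}\right)\cdots\left(\frac{M-q+1}{M}\right),$

which implies our result.

As seen above, the probability to find a collision is

$1 - \left(1-\frac{1}{M}\right)\left(1-\frac{2}{M}\right)\cdots\left(1-\frac{q-1}{M}\right).$

For x small, we have $e^{-x} = 1 - x + \frac{x^2}{2!} - \frac{x^3}{3!} + \cdots \approx 1 - x$. Therefore, the probability of finding no collisions is

approximately

$\prod_{i=1}^{q-1}\left(1-\frac{i}{M}\right) \approx \prod_{i=1}^{q-1} e^{-i/M} = e^{-\sum_{i=1}^{q-1} i/M} = e^{-q(q-1)/(2M)}.$

Therefore, the probability of finding at least one collision is

$\epsilon \approx 1 - e^{-q(q-1)/(2M)}.$

Solving for q, we have

$q^2 - q \approx 2M \ln\frac{1}{1-\epsilon}$

and ignoring the linear term q gives

$q \approx \sqrt{2M \ln\frac{1}{1-\epsilon}}.$

For $\epsilon = 0.5$ we get

$q \approx 1.17\sqrt{M}.$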

This means that approximately $\sqrt{M}$ random elements of X yield a collision with probability 1/2. The birthday

paradox is obtained for M = 365 which gives q ≈ 22.3. So, the probability that at least 2 people among 23 randomly

chosen have the same birthday is about 1/2. (This is no paradox but it is probably unexpected.) From this example,

the attack which tries a high number of random choices attempting to find a collision is called the birthday attack.

Size of message digests. The birthday attack imposes a lower bound on the size of secure message digests.

A 40-bit message digest would be very insecure since a collision would be found with probability 1/2 with just over

$2^{20} \approx 10^6$ random hashes. The minimum acceptable size is 128 bits but 160-bit message digests are recommended.

Comparison of security criteria. Solving the Collision problem is easier than Preimage or Second Preimage.

The former required a number of hashes proportional to $\sqrt{M}$ while the latter two needed a number of hashes

which is linear in M.
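The exact collision probability and the approximation for q derived in this section can be compared numerically. The following is a small sketch with hypothetical function names.

```python
from math import log, sqrt

def collision_probability(q, M):
    """Exact probability 1 - (1 - 1/M)(1 - 2/M)...(1 - (q-1)/M)."""
    no_collision = 1.0
    for i in range(1, q):
        no_collision *= 1 - i / M
    return 1 - no_collision

def queries_needed(M, eps=0.5):
    """Approximation q = sqrt(2 M ln(1 / (1 - eps)))."""
    return sqrt(2 * M * log(1 / (1 - eps)))

print(queries_needed(365))               # a little above 22
print(collision_probability(23, 365))    # just over 1/2
print(queries_needed(2 ** 40))           # about 1.2 * 2**20 hashes for a 40-bit digest
```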

13.4 Iterated hash functions 

So far we have considered hash functions with a finite domain (compression functions). In practice we need

hash functions with very large domains. We show next a technique which uses a compression function to build 

a hash function with infinite domain. The compression function is used repeatedly and the obtained function is 

called iterated hash function. The basic principle of this construction applies to most hash functions currently 

in use. We shall assume all messages are binary. 

Assume we have a compression function $f : \{0,1\}^{n+b} \to \{0,1\}^n$ and an input string x. We first pad x at

the end such that its length becomes a multiple of b and then break the obtained string into blocks of length

b each; the blocks are $y_1, y_2, \ldots$. Then, each block $y_i$ is appended to the message digest from the

previous compression (of length n) and the result is compressed again using the compression function. The last

compression gives the message digest; see Fig 11.10. Usually, the length of x is also appended at the end.

It is essential to notice that if the compression function is secure then so is the iterated function. We show

below a precise construction of this kind for which it can be proved that the security is preserved.


Assume compress : {0,1} m+t → {0,1} m is a collision resistant compression function. We shall use compress 

to construct a collision resistant hash function 

$h : \bigcup_{i=m+t+1}^{\infty} \{0,1\}^i \to \{0,1\}^m.$

We shall assume t ≥ 2 but the construction can also be done for t = 1. The construction is shown in the 

algorithm below. 

Merkle-Damgård(x) 

- given: compress collision resistant function, x message 

- computes: h(x) message digest 

Algorithm: 

1. $n \leftarrow |x|$, $k \leftarrow \lceil n/(t-1) \rceil$, $d \leftarrow k(t-1)-n$   (d is the length to be padded)

2. put $x = x_1 \| x_2 \| \cdots \| x_k$, with $|x_i| = t-1$, $1 \le i \le k-1$

3. for i from 1 to k−1 do

4. $y_i \leftarrow x_i$   (the first k−1 blocks)

5. $y_k \leftarrow x_k \| 0^d$   (the last block is padded)

6. $y_{k+1} \leftarrow 0^{t-1-|\mathrm{binary}(d)|} \| \mathrm{binary}(d)$   (length of padding is appended)

7. $z_1 \leftarrow 0^{m+1} \| y_1$   (initial value)

8. $g_1 \leftarrow \mathrm{compress}(z_1)$

9. for i from 1 to k do

10. $z_{i+1} \leftarrow g_i \| 1 \| y_{i+1}$   (next string to be compressed)

11. $g_{i+1} \leftarrow \mathrm{compress}(z_{i+1})$

12. $h(x) \leftarrow g_{k+1}$   (last compression gives the digest)

13. return h(x)

It can be proved that if compress is collision resistant, then h is collision resistant. The idea is, given a 

collision for h, a collision for compress can be found in polynomial time. 

13.5 MD5 

– see textbook 

13.6 SHA-1 


13.7 RIPEMD-160 


13.8 Message authentication codes 

A common way of constructing a MAC is to incorporate a secret key into an unkeyed hash function, by including

it as a part of the message to be hashed. However, this should be done carefully. We show below some possible

pitfalls.

Let h be an unkeyed iterated hash function built from a compression function $\mathrm{compress} : \{0,1\}^{m+t} \to \{0,1\}^m$. Assume the key has m bits and is

incorporated as the initial vector IV. An opponent can construct a valid MAC for a certain message as follows, 

assuming he knows a pair (x,h K (x)). For any t-bit string x ′ , the MAC for the message x‖x ′ is 

h K (x‖x ′ ) = compress(h K (x)‖x ′ ).


We assumed above that messages are not padded; their length was assumed already a multiple of t. But even 

if messages are padded, a modification of the above attack can be carried out. Assume y = x‖pad(x). Let w be 

a bit string of length t and put 

x ′ = x‖pad(x)‖w. 

We have 

y ′ = x ′ ‖pad(x ′ ) = x‖pad(x)‖w‖pad(x ′ ). 

Also $|y'| = r't$ and $|y| = rt$ where $r' > r$. When computing $h_K(x')$, we have

$z_{r+1} \leftarrow \mathrm{compress}(h_K(x) \| y_{r+1})$

$z_{r+2} \leftarrow \mathrm{compress}(z_{r+1} \| y_{r+2})$

$\vdots$

$z_{r'} \leftarrow \mathrm{compress}(z_{r'-1} \| y_{r'}).$

So, again the opponent can compute h K (x ′ ) without knowing K. 
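The unpadded version of this attack is easy to reproduce. In the sketch below the key is used as the initial vector of a toy iterated hash; the sizes m = 32, t = 16 and the SHA-256-based `compress` are assumptions chosen only to make the example runnable, and the function names are hypothetical.

```python
import hashlib

M_BITS, T_BITS = 32, 16        # toy values of m and t (assumptions of this sketch)

def compress(z):
    """Toy stand-in for compress: {0,1}^(m+t) -> {0,1}^m, built from SHA-256."""
    assert len(z) == M_BITS + T_BITS
    digest = hashlib.sha256(z.encode()).digest()
    return format(int.from_bytes(digest, "big"), "0256b")[:M_BITS]

def keyed_hash(key, x):
    """h_K(x): iterate compress with the m-bit key as IV; |x| must be a multiple of t."""
    assert len(key) == M_BITS and len(x) % T_BITS == 0
    g = key
    for i in range(0, len(x), T_BITS):
        g = compress(g + x[i:i + T_BITS])
    return g

def extend(mac, x_extra):
    """Opponent: compute h_K(x || x') from h_K(x) and a t-bit x', without K."""
    return compress(mac + x_extra)

key = "01" * 16                          # secret key, never given to extend()
x = "1100" * 8                           # a message whose MAC the opponent has seen
forged = extend(keyed_hash(key, x), "1" * 16)
print(forged == keyed_hash(key, x + "1" * 16))
```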

13.9 CBC-MAC 

One of the most widely used MACs is based on CBC mode of DES with an initialization vector of zeros. The 

data are grouped into 64-bit blocks. If necessary, the final block is padded with zeros to the right to have 64 

bits. The code is produced as shown in Fig. 11.6. 

13.10 HMAC 


13.11 Basic uses of encryption, hash functions, and MACs 

We show in Figs. 11.1, 11.4, and 11.5 and Tables 11.1, 11.2, and 11.3 the basic ways to use encryption, MACs, 

and hash functions in order to achieve goals such as confidentiality, authentication, and signature. 

The notations used are described below: 

- M – message (plaintext) 

- E – encryption algorithm 

- D – decryption algorithm 

- C – MAC algorithm 

- H – hash algorithm 

- K (or K 1 ,K 2 ) – secret key 

- KU a – A’s public key


- KR a – A’s private key 

- KU b – B’s public key 

- KR b – B’s private key



14 DIGITAL SIGNATURES AND AUTHENTICATION 

- a method of signing a message in electronic form 

- also called digital signatures 

14.1 Digital versus conventional signatures 

- attaching to the document 

- conventional signature – physically attached to a document 

- digital signature – is not physically attached 

- it must be somehow bound to the message 

- verifying 

- conventional – verified by comparison with others 

- digital – verified using a publicly known verification algorithm 

- to prevent forgeries 

- copying 

- conventional – a copy should be different from the original :-) 

- digital – a copy is perfectly identical 

- must prevent reuse – e.g., include the date in the message 

14.2 What is a signature scheme 

- two components 

- signing algorithm – secret – the message x is signed: sig(x) 

- verification algorithm – public – ver(x,y) – verifies the signature 

- signature scheme – (P,A,K,S,V) 

- P – messages 

- A – signatures 

- K – keys 

- S – signing algorithms 

- V – verification algorithms 

- for each K ∈ K, there are sig K ∈ S and ver K ∈ V 

- sig K : P → A – polynomial-time function, secret 

- ver K : P ×A → {true,false} – polynomial-time function, public 

- for every message x ∈ P and every signature y ∈ A:

$\mathrm{ver}_K(x,y) = \begin{cases} \text{true} & \text{if } y = \mathrm{sig}_K(x) \\ \text{false} & \text{if } y \ne \mathrm{sig}_K(x) \end{cases}$

- goal – computationally infeasible for Oscar to forge Bob’s signature on a message x 

- unconditional security – impossible 

- given sufficient time, Oscar can test all possible y’s using the public ver until the right one is found 

14.3 RSA signature scheme 

RSA signature scheme 

P = A = Z n ; n = pq, p,q primes


K = {(n,p,q,a,b) | n = pq,p,q primes ,ab ≡ 1 (mod φ(n))}. 

public: n,b 

private: p,q,a 

signature: $\mathrm{sig}_K(x) = x^a \bmod n$ (that is, $\mathrm{sig}_K = d_K$)

verification: $\mathrm{ver}_K(x,y) = \text{true}$ iff $x = y^b \bmod n$ (that is, $x = e_K(y)$)

- only Bob can sign messages since d K is secret 

- anyone can verify signatures since e K is public 

- forged signatures on random messages 

- Oscar can choose y and compute x = e K (y) 

- this means sig K (x) = y so y is a correct signature for x 

- problem: x is meaningless, with very high probability 

- combining signing and public-key encrypting 

- Alice wants to send a signed encrypted message x to Bob 

- Alice computes her signature: y = sig Alice (x) 

- Alice encrypts both x and y using Bob’s public key: z = e Bob (x,y) 

- Bob receives z and first decrypts it: d Bob (z) 

- Bob uses Alice’s public verification algorithm: ver Alice (x,y) = true 

- what if encryption comes first (before signing) 

- Alice computes: z = e Bob (x), y = sig Alice (z), and sends (z,y) 

- Bob computes: ver Alice (z,y) = true and x = d Bob (z) (in any order) 

- problem: Oscar can replace (z,y) by (z,y ′ = sig Oscar (z)) 

- Oscar can sign z without decrypting 

- Bob will infer that the message x originated with Oscar 
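The signing and verification rules, and the forgery on a random message, can be illustrated with toy parameters. The primes 61 and 53 and the exponent b = 17 are illustrative assumptions; real moduli are far larger, and in practice a hash of the message is signed.

```python
def rsa_keys(p, q, b):
    """Return (n, a) with a*b = 1 (mod phi(n)); b is the public exponent."""
    n, phi = p * q, (p - 1) * (q - 1)
    return n, pow(b, -1, phi)

def rsa_sign(x, a, n):
    """sig_K(x) = x^a mod n, using the private exponent a."""
    return pow(x, a, n)

def rsa_verify(x, y, b, n):
    """ver_K(x, y) is true iff x = y^b mod n."""
    return x == pow(y, b, n)

n, a = rsa_keys(61, 53, 17)            # toy key: n = 3233
y = rsa_sign(65, a, n)
print(rsa_verify(65, y, 17, n))        # Bob's genuine signature verifies

# Forgery on a random message: Oscar picks y first and derives x = e_K(y)
y_forged = 1234
x_forged = pow(y_forged, 17, n)
print(rsa_verify(x_forged, y_forged, 17, n))
```

Oscar never uses a in the forgery, but he also has no control over the value of x_forged.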

14.4 ElGamal signature scheme 

ElGamal signature scheme 

P = Z ∗ p ; A = Z∗ p ×Z p−1; p prime, α ∈ Z ∗ p primitive 

K = {(p,α,a,β) | β ≡ α a (mod p)}. 

public: p,α,β 

private: a 

signature: sig K (x,k) = (γ,δ) = (α k mod p,(x−aγ)k −1 mod (p−1)) 

- k ∈ Z ∗ p−1 is a secret random number 

verification: ver K (x,(γ,δ)) = true iff β γ γ δ ≡ α x (mod p) 

- correctness 

- we have by construction x ≡ aγ +kδ (mod p−1) 

- therefore β γ γ δ ≡ α aγ α kδ ≡ α x (mod p) 

- security 

- Oscar wants to compute a signature for a message x without knowing a 

- if he chooses γ, he has to compute δ = log γ α x β −γ 

- this is a discretelog problem 

- if he chooses δ, he has to compute γ from β γ γ δ ≡ α x (mod p) 

- no feasible solution known to this problem 

- it does not seem to be related to discretelog


- open problem – it might be possible to compute γ and δ simultaneously such that (γ,δ) is a signature 

- (useless) forgeries 

- Oscar can choose γ,δ,x simultaneously 

- assume 0 ≤ i ≤ p−2, 0 ≤ j ≤ p−2, gcd(j,p−1) = 1 

- Oscar chooses: 

γ = α i β j mod p 

δ = −γj −1 mod (p−1) 

x = −γij −1 mod (p−1) (j −1 is computed modulo p−1) 

- then (γ,δ) is a valid signature for x 

- Oscar begins with a message previously signed by Bob: (γ,δ) = sig Bob (x) 

- Oscar can sign other messages 

- assume 0 ≤ h,i,j ≤ p−2, gcd(hγ −jδ,p−1) = 1 

- Oscar computes: 

λ = γ h α i β j mod p 

µ = δλ(hγ −jδ) −1 mod (p−1) 

x ′ = λ(hx+iδ)(hγ −jδ) −1 mod (p−1) 

- then (λ,µ) is a valid signature for x ′ 

- these forgeries are no threats to the security as Oscar cannot sign a message of his own choosing 

- careless use of the scheme 

- k must not be revealed 

a = (x−kδ)γ −1 mod (p−1) – the system is broken 

- signing two messages with the same k

- assume $\mathrm{sig}_K(x_1) = (\gamma,\delta_1)$ and $\mathrm{sig}_K(x_2) = (\gamma,\delta_2)$

- then $\alpha^{x_1-x_2} \equiv \gamma^{\delta_1-\delta_2} \pmod p$

- so, using $\gamma = \alpha^k$, $\alpha^{x_1-x_2} \equiv \alpha^{k(\delta_1-\delta_2)} \pmod p$

- this gives $x_1-x_2 \equiv k(\delta_1-\delta_2) \pmod{p-1}$

- if $d = \gcd(\delta_1-\delta_2, p-1)$ then $d \mid (x_1-x_2)$

- put $x' = \frac{x_1-x_2}{d}$, $\delta' = \frac{\delta_1-\delta_2}{d}$, $p' = \frac{p-1}{d}$

- we have then $x' \equiv k\delta' \pmod{p'}$

- this gives $k = x'(\delta')^{-1} \bmod p'$

- there are d candidates for k: $k = x'(\delta')^{-1} + ip' \bmod (p-1)$, $0 \le i \le d-1$

- the correct one comes from $\gamma \equiv \alpha^k \pmod p$
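The danger of reusing k can be demonstrated directly. The sketch below implements signing and verification as defined in this section, then recovers k from two signatures that share it. The parameters p = 467, α = 2, a = 127, k = 213 are illustrative assumptions, and the function names are hypothetical.

```python
from math import gcd

def elgamal_sign(p, alpha, a, x, k):
    """(gamma, delta) = (alpha^k mod p, (x - a*gamma) * k^(-1) mod (p-1))."""
    gamma = pow(alpha, k, p)
    delta = (x - a * gamma) * pow(k, -1, p - 1) % (p - 1)
    return gamma, delta

def elgamal_verify(p, alpha, beta, x, gamma, delta):
    """True iff beta^gamma * gamma^delta = alpha^x (mod p)."""
    return pow(beta, gamma, p) * pow(gamma, delta, p) % p == pow(alpha, x, p)

def recover_k(p, alpha, x1, sig1, x2, sig2):
    """Return the candidates for k when sig1 and sig2 were made with the same k."""
    gamma, delta1 = sig1
    _, delta2 = sig2
    diff = (delta1 - delta2) % (p - 1)
    d = gcd(diff, p - 1)
    x_, delta_, p_ = (x1 - x2) // d, diff // d, (p - 1) // d
    k0 = x_ * pow(delta_, -1, p_) % p_
    candidates = [(k0 + i * p_) % (p - 1) for i in range(d)]
    return [k for k in candidates if pow(alpha, k, p) == gamma]

p, alpha, a, k = 467, 2, 127, 213
s1 = elgamal_sign(p, alpha, a, 100, k)
s2 = elgamal_sign(p, alpha, a, 200, k)   # careless: the same k is reused
print(recover_k(p, alpha, 100, s1, 200, s2))
```

Once k is known, the private key follows from a = (x − kδ)γ^{−1} mod (p−1) whenever γ is invertible modulo p−1.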


14.5 Schnorr signature scheme 

- idea: using two primes p ≈ 2 1024 and q ≈ 2 160 , sign message digests of size log 2 q using signatures of size 

2log 2 q such that the computations are done in Z p 

Schnorr signature scheme 

P = {0,1} ∗ ; A = Z q ×Z q ; p prime, q prime, q|p−1 

K = {(p,q,α,a,β) | β ≡ α a (mod p)}; α ∈ Z ∗ p qth root of 1 modulo p 

- $\alpha = \alpha_0^{(p-1)/q} \bmod p$, for $\alpha_0$ primitive

public: p,q,α,β 

private: a 

signature: sig K (x,k) = (γ,δ) = (h(x‖α k ),k +aγ mod q) 

- h : {0,1} ∗ → Z q is a secure hash function 

- 1 ≤ k ≤ q −1 is a secret random number 

verification: ver K (x,(γ,δ)) = true iff h(x‖α δ β −γ ) = γ 

- correctness 

- it is easy to check that α δ β −γ ≡ α k (mod p) 
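A toy version of the scheme shows the correctness check at work. Everything concrete below is an assumption of the sketch: the small parameters q = 101, p = 7879 = 78q + 1, the use of SHA-256 reduced modulo q as the hash h, and the byte encoding of x‖α^k.

```python
import hashlib

def hash_to_zq(data, q):
    """Toy hash h: bytes -> Z_q (SHA-256 reduced modulo q)."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % q

def schnorr_sign(p, q, alpha, a, x, k):
    """(gamma, delta) = (h(x || alpha^k), k + a*gamma mod q)."""
    r = pow(alpha, k, p)
    gamma = hash_to_zq(x + b"|" + str(r).encode(), q)
    delta = (k + a * gamma) % q
    return gamma, delta

def schnorr_verify(p, q, alpha, beta, x, gamma, delta):
    """True iff h(x || alpha^delta * beta^(-gamma)) = gamma."""
    r = pow(alpha, delta, p) * pow(beta, -gamma, p) % p   # equals alpha^k
    return hash_to_zq(x + b"|" + str(r).encode(), q) == gamma

p, q = 7879, 101
alpha = pow(3, (p - 1) // q, p)        # a q-th root of 1 modulo p
a = 75                                 # private key
beta = pow(alpha, a, p)                # public key
sig = schnorr_sign(p, q, alpha, a, b"message", k=50)
print(schnorr_verify(p, q, alpha, beta, b"message", *sig))
```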

14.6 Digital Signature Algorithm (DSA) 

Digital Signature Algorithm (DSA) 

P = {0,1} ∗ ; A = Z q ×Z q ; p L-bit prime (512 ≤ L ≤ 1024,L≡ 0 (mod 64)), q 160-bit prime, q|p−1 

K = {(p,q,α,a,β) | β ≡ α a (mod p)}; α ∈ Z ∗ p qth root of 1 modulo p 

- $\alpha = \alpha_0^{(p-1)/q} \bmod p$, for $\alpha_0$ primitive

public: p,q,α,β 

private: a 

signature: sig K (x,k) = (γ,δ) = ((α k mod p) mod q,(SHA-1(x)+aγ)k −1 mod q) 

- 1 ≤ k ≤ q −1 is a secret random number 

- if γ = 0 or δ = 0 then a new random k is chosen 

verification: ver K (x,(γ,δ)) = true iff (α e1 β e2 mod p) mod q = γ 

e 1 = SHA-1(x)δ −1 mod q 

e 2 = γδ −1 mod q 

- correctness 

- start with ElGamal signature sig K (x,k) = (γ,δ) = (α k mod p,(x−aγ)k −1 mod (p−1)) 

- change δ to δ = (x+aγ)k −1 mod (p−1) 

- verification becomes: α x β γ ≡ γ δ (mod p) 

- we can reduce all exponents modulo q: α x mod q β γ mod q ≡ γ δ mod q (mod p) 

- we can assume x is already reduced as it is a message digest 

- put δ = (x+aγ)k −1 mod q, γ ′ = γ mod q 

- verification is now: α x β γ′ ≡ γ δ (mod p) 

- raise both sides to δ −1 mod q: α xδ−1 β γ′ δ −1 mod p = γ 

- reduce modulo q: (α xδ−1 β γ′ δ −1 mod p) mod q = γ ′
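The signing and verification equations can be exercised with toy numbers. The sketch assumes the small parameters q = 101 and p = 7879 (so the sizes required by DSA are deliberately not respected) and reduces the SHA-1 digest modulo q; the function names are hypothetical.

```python
import hashlib

def sha1_int(x, q):
    """SHA-1(x) as an integer, reduced modulo the toy q."""
    return int.from_bytes(hashlib.sha1(x).digest(), "big") % q

def dsa_sign(p, q, alpha, a, x, k):
    """Return (gamma, delta), or None when a new random k must be chosen."""
    gamma = pow(alpha, k, p) % q
    delta = (sha1_int(x, q) + a * gamma) * pow(k, -1, q) % q
    if gamma == 0 or delta == 0:
        return None
    return gamma, delta

def dsa_verify(p, q, alpha, beta, x, gamma, delta):
    """True iff (alpha^e1 * beta^e2 mod p) mod q = gamma."""
    if not (0 < gamma < q and 0 < delta < q):
        return False
    w = pow(delta, -1, q)
    e1 = sha1_int(x, q) * w % q
    e2 = gamma * w % q
    return pow(alpha, e1, p) * pow(beta, e2, p) % p % q == gamma

p, q = 7879, 101
alpha = pow(3, (p - 1) // q, p)        # a q-th root of 1 modulo p
a = 75                                 # private key
beta = pow(alpha, a, p)                # public key
sig = None
for k in range(1, q):                  # stand-in for drawing random k
    sig = dsa_sign(p, q, alpha, a, b"message", k)
    if sig is not None:
        break
print(sig is not None and dsa_verify(p, q, alpha, beta, b"message", *sig))
```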


15 KEY DISTRIBUTION AND KEY AGREEMENT 

- secret-key cipher – needs a secure channel to exchange a secret key 

- public-key cipher – needs no secure channel to exchange a secret key 

- public-key ciphers – much slower than secret-key ciphers (1500 times) 

- for long messages 

- encryption is done using secret-key ciphers 

- the secret keys are exchanged using public-key ciphers 

- key distribution – one party chooses a secret key and transmits it to other parties 

- key agreement – a protocol where several parties establish together a secret key over a public channel 

- setup 

- insecure network of n users 

- we might have – trusted authority (TA) 

- verifies identities 

- chooses keys 

- transmits keys 

- adversary (Oscar) 

- passive – eavesdropping 

- active 

- alter messages 

- save messages for later use 

- masquerade as various users 

- examples of Oscar’s potential goals: 

- to fool two users U and V into accepting an invalid key 

- to make U believe that he has exchanged a key with V when he actually has not 

- goal – U and V should have at the end of the protocol a secret key, unknown to anyone else (except possibly 

the TA) 

15.1 Key distribution 

- if each pair of users independently exchanges a secret key (over a secure channel) then:

- $\binom{n}{2}$ secure channels needed

- $\binom{n}{2}$ keys needed

- each user must store n−1 keys

- with TA

- for each pair of users U, V, it chooses and transmits a key $K_{U,V} = K_{V,U}$

- n secure channels needed – instead of $\binom{n}{2}$

- $\binom{n}{2}$ keys needed

- each user must store n−1 keys

- still too many keys – of the order $n^2$

- this is called the $n^2$-problem

- goal 

- to reduce the number of transmitted keys 

- to reduce the number of stored keys 

- still each pair of users should be able to compute independently the secret K U,V = K V,U


15.2 Blom key distribution scheme 

Blom Key Distribution Scheme 

- given: p public prime and, for each user U, r U ∈ Z p , public 

1. TA chooses secret random a,b,c ∈ Z p 

2. TA forms the polynomial f(x,y) = a+b(x+y)+cxy mod p 

3. TA transmits to each U: a U = a+br U mod p and b U = b+cr U mod p 

4. U has g U (x) = a U +b U x = f(x,r U ) mod p 

5. U and V communicate by using the common secret key

$K_{U,V} = K_{V,U} = f(r_U, r_V) = a + b(r_U + r_V) + c\,r_U r_V \bmod p$

computed by U and V as

$g_U(r_V) = f(r_U, r_V) = g_V(r_U)$

- TA transmits two elements to each user 

- n channels needed 

- 2n keys needed 

- each user must store two elements 

- security 

- unconditionally secure against any individual user 

- any coalition of two users can determine all keys 

- generalization 

- TA chooses $f(x,y) = \sum_{i=0}^{k}\sum_{j=0}^{k} a_{ij} x^i y^j \bmod p$, $a_{ij} = a_{ji}$

- this scheme is secure against any coalition of size k 

- is completely broken by any coalition of size k +1 
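The basic scheme (one polynomial with coefficients a, b, c) takes only a few lines to simulate. The numbers used below (p = 17, the public values r_U and the TA's secret coefficients) are illustrative assumptions, and the function names are hypothetical.

```python
def blom_setup(p, a, b, c, r):
    """TA: return each user's share (a_U, b_U) = (a + b*r_U, b + c*r_U) mod p."""
    return {u: ((a + b * ru) % p, (b + c * ru) % p) for u, ru in r.items()}

def blom_key(p, share, r_other):
    """User U: evaluate g_U(x) = a_U + b_U * x at the other user's public value."""
    a_u, b_u = share
    return (a_u + b_u * r_other) % p

p = 17
r = {"U": 12, "V": 7, "W": 1}                 # public values r_U
shares = blom_setup(p, a=8, b=7, c=2, r=r)    # a, b, c are the TA's secrets
k_uv = blom_key(p, shares["U"], r["V"])       # computed by U
k_vu = blom_key(p, shares["V"], r["U"])       # computed by V
print(k_uv, k_vu)
```

Each user stores only two elements of Z_p, yet every pair of users obtains the same key without further interaction.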

15.3 Diffie-Hellman key distribution scheme 

Diffie-Hellman Key Distribution Scheme 

- given: p public prime and α ∈ Z ∗ p a public primitive element 

- TA has secret sig TA and public ver TA 

- U has secret a U ≤ p−2, public b U = α aU mod p and certificate 

C(U) = (ID(U),b U ,sig TA (ID(U),b U )) 

- everything is hashed before signed 

1. V computes $K_{U,V} = \alpha^{a_U a_V} \bmod p = b_U^{a_V} \bmod p$

2. U computes $K_{V,U} = \alpha^{a_U a_V} \bmod p = b_V^{a_U} \bmod p$

- security 

- the certificate cannot be altered because of the signature of the TA 

- problem: given $b_U$ and $b_V$, can Oscar compute $K_{U,V}$ without knowing $a_U$ and $a_V$?

Diffie-Hellman Problem (diffie-hellman)

- given: $p$ prime, $\alpha \in \mathbb{Z}_p^*$ primitive, $\beta, \gamma \in \mathbb{Z}_p^*$

- compute: $\beta^{\log_\alpha \gamma} \bmod p$ $(= \gamma^{\log_\alpha \beta} \bmod p)$


Theorem 15.1. Solving diffie-hellman is equivalent to breaking the ElGamal cryptosystem.

15.4 Kerberos 

- keys used for long time can be compromised 

- idea: new key every time a pair of users want to communicate (key freshness) 

- the users need not share secret keys 

- each user $U$ will share a secret key $K_U$ with TA

- Kerberos – secret-key based 

A session key in Kerberos 

- given: each user $U$ shares a secret key $K_U$ with TA

1. $U$ asks TA for a session key to communicate with $V$

2. TA chooses a random session key $K$, a timestamp $T$, and a lifetime $L$

3. TA sends to $U$

$$m_1 = e_{K_U}(K, \mathrm{ID}(V), T, L) \quad\text{and}\quad m_2 = e_{K_V}(K, \mathrm{ID}(U), T, L)$$

4. $U$ decrypts $m_1$ and obtains $K$, $T$, $L$, and $\mathrm{ID}(V)$

5. $U$ sends to $V$ both $m_2$ (from TA) and $m_3 = e_K(\mathrm{ID}(U), T)$

6. $V$ decrypts $m_2$ using $K_V$ to obtain $K$, and then decrypts $m_3$ using $K$

7. $V$ verifies that the two $T$'s and the two $\mathrm{ID}(U)$'s are the same

8. $V$ sends to $U$ $m_4 = e_K(T+1)$

9. $U$ decrypts $m_4$ and verifies $T+1$

- $m_1$ and $m_2$ – for key security

- $m_3$ and $m_4$ – for key confirmation

- $T$ and $L$ – to prevent Oscar from storing old keys
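The message flow above can be traced with a small Python sketch. It is not a Kerberos implementation: the cipher `toy_cipher` is a hypothetical, insecure XOR stream built from SHA-256 (the standard library has no block cipher), the JSON encoding of the fields is an assumption, and the lifetime check is left out. Only the order of messages and the final checks follow the steps in the notes.

```python
import hashlib
import json
import secrets
import time

def toy_cipher(key: bytes, data: bytes) -> bytes:
    """Toy XOR stream cipher (NOT secure); the same call encrypts and decrypts."""
    stream, counter = b"", 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return bytes(x ^ y for x, y in zip(data, stream))

def enc(key, *fields):
    return toy_cipher(key, json.dumps(fields).encode())

def dec(key, blob):
    return json.loads(toy_cipher(key, blob).decode())

# long-term keys K_U and K_V, each shared with TA
K_U, K_V = secrets.token_bytes(16), secrets.token_bytes(16)

# steps 2-3: TA picks K, T, L and builds m1, m2
K = secrets.token_bytes(16)
T, L = int(time.time()), 300
m1 = enc(K_U, K.hex(), "V", T, L)
m2 = enc(K_V, K.hex(), "U", T, L)

# steps 4-5: U recovers K and builds m3
k_hex, peer_id, t_u, l_u = dec(K_U, m1)
session_u = bytes.fromhex(k_hex)
m3 = enc(session_u, "U", t_u)

# steps 6-8: V recovers K from m2, checks m3, answers with m4
k_hex_v, id_from_ta, t_v, l_v = dec(K_V, m2)
session_v = bytes.fromhex(k_hex_v)
id_from_u, t_from_u = dec(session_v, m3)
v_accepts = (id_from_ta == id_from_u) and (t_v == t_from_u)
m4 = enc(session_v, t_v + 1)

# step 9: U checks T + 1
(t_plus_one,) = dec(session_u, m4)
u_accepts = (t_plus_one == t_u + 1)
print(v_accepts, u_accepts)
```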

15.5 Diffie-Hellman key exchange scheme 

- without on-line key server 

Diffie-Hellman Key Exchange Scheme 


1. $U$ chooses a random $a_U \le p-2$

2. $U$ sends $\alpha^{a_U} \bmod p$ to $V$

3. $V$ chooses a random $a_V \le p-2$

4. $V$ sends $\alpha^{a_V} \bmod p$ to $U$

5. $U$ computes $K_{U,V} = (\alpha^{a_V})^{a_U} \bmod p = \alpha^{a_U a_V} \bmod p$

6. $V$ computes $K_{V,U} = (\alpha^{a_U})^{a_V} \bmod p = \alpha^{a_U a_V} \bmod p$

- Diffie-Hellman key exchange – the information transmitted:

$$U \xrightarrow{\;\alpha^{a_U}\;} V$$

$$U \xleftarrow{\;\alpha^{a_V}\;} V$$


- intruder-in-the-middle attack 

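$$U \xrightarrow{\;\alpha^{a_U}\;} \text{Oscar} \xrightarrow{\;\alpha^{a'_U}\;} V$$

$$U \xleftarrow{\;\alpha^{a'_V}\;} \text{Oscar} \xleftarrow{\;\alpha^{a_V}\;} V$$

- Oscar has two keys $K_{\text{Oscar},U} = \alpha^{a_U a'_V} \bmod p$ and $K_{\text{Oscar},V} = \alpha^{a'_U a_V} \bmod p$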

- Oscar can communicate with either of U and V 

- U and V cannot notice that they do not communicate with each other 

- U and V cannot communicate with each other as their keys are different 
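Both the honest exchange and the intruder-in-the-middle attack can be reproduced with modular exponentiation. The sketch below uses toy parameters that are assumptions made for illustration: $p = 467$ and $\alpha = 2$ (2 is a primitive element modulo 467, since $466 = 2 \cdot 233$ and $2^{233} \equiv -1 \pmod{467}$). Real parameters would be far larger.

```python
import secrets

p, alpha = 467, 2   # toy public parameters (assumed for illustration)

def keypair():
    """Pick a secret exponent 1 <= a <= p-2 and return (a, alpha^a mod p)."""
    a = secrets.randbelow(p - 2) + 1
    return a, pow(alpha, a, p)

# honest run
a_u, msg_u = keypair()            # U sends msg_u
a_v, msg_v = keypair()            # V sends msg_v
k_uv = pow(msg_v, a_u, p)         # computed by U
k_vu = pow(msg_u, a_v, p)         # computed by V

# intruder in the middle: Oscar replaces both messages with his own
o_u, fake_u = keypair()           # a'_U, sent to V in place of msg_u
o_v, fake_v = keypair()           # a'_V, sent to U in place of msg_v
k_u_side = pow(fake_v, a_u, p)    # what U believes is shared with V
k_v_side = pow(fake_u, a_v, p)    # what V believes is shared with U
k_oscar_u = pow(msg_u, o_v, p)    # Oscar's key with U
k_oscar_v = pow(msg_v, o_u, p)    # Oscar's key with V

print(k_uv == k_vu, k_u_side == k_oscar_u, k_v_side == k_oscar_v)
```

Oscar ends up sharing one key with each party, which is why he can read and re-encrypt traffic in both directions.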

15.6 Station-to-station protocol 

- idea: to avoid intruder-in-the-middle attack 

- the key-agreement protocol should authenticate also the identities of the parties 

- authenticated key agreement 

- uses certificates and signatures (of the TA and users) 

Station-to-station Protocol 



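- each user $U$ has secret $\mathrm{sig}_U$, public $\mathrm{ver}_U$, and a public certificate

$$C(U) = (\mathrm{ID}(U), \mathrm{ver}_U, \mathrm{sig}_{TA}(\mathrm{ID}(U), \mathrm{ver}_U))$$

1. $U$ chooses a random $a_U \le p-2$

2. $U$ computes and sends $\alpha^{a_U} \bmod p$ to $V$

3. $V$ chooses a random $a_V \le p-2$

4. $V$ computes $\alpha^{a_V} \bmod p$, $K_{V,U} = \alpha^{a_U a_V} \bmod p$, and $y_V = \mathrm{sig}_V(\alpha^{a_V}, \alpha^{a_U})$

5. $V$ sends $(C(V), \alpha^{a_V} \bmod p, y_V)$ to $U$

6. $U$ computes $K_{U,V} = \alpha^{a_U a_V} \bmod p$

7. $U$ verifies $y_V$ using $\mathrm{ver}_V$ and $C(V)$ using $\mathrm{ver}_{TA}$

8. $U$ computes $y_U = \mathrm{sig}_U(\alpha^{a_U}, \alpha^{a_V})$ and sends $(C(U), y_U)$ to $V$

9. $V$ verifies $y_U$ using $\mathrm{ver}_U$ and $C(U)$ using $\mathrm{ver}_{TA}$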

- the information is transmitted as follows (three-pass protocol): 

$$U \xrightarrow{\;\alpha^{a_U}\;} V$$

$$U \xleftarrow{\;C(V),\ \alpha^{a_V},\ \mathrm{sig}_V(\alpha^{a_V}, \alpha^{a_U})\;} V$$

$$U \xrightarrow{\;C(U),\ \mathrm{sig}_U(\alpha^{a_U}, \alpha^{a_V})\;} V$$

- attempt of intruder-in-the-middle attack: 

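- Oscar cannot compute $\mathrm{sig}_V(\alpha^{a'_V}, \alpha^{a_U})$ to send to $U$

- Oscar cannot compute $\mathrm{sig}_U(\alpha^{a'_U}, \alpha^{a_V})$ to send to $V$

$$U \xrightarrow{\;\alpha^{a_U}\;} \text{Oscar} \xrightarrow{\;\alpha^{a'_U}\;} V$$

$$U \xleftarrow{\;\alpha^{a'_V},\ \mathrm{sig}_V(\alpha^{a'_V}, \alpha^{a_U}) = \,?\;} \text{Oscar} \xleftarrow{\;\alpha^{a_V},\ \mathrm{sig}_V(\alpha^{a_V}, \alpha^{a'_U})\;} V$$

$$U \xrightarrow{\;\mathrm{sig}_U(\alpha^{a_U}, \alpha^{a'_V})\;} \text{Oscar} \xrightarrow{\;\mathrm{sig}_U(\alpha^{a'_U}, \alpha^{a_V}) = \,?\;} V$$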


15.7 MTI key agreement protocol 

- idea: without signatures of users 

(MTI = Matsumoto, Takashima, Imai) 

MTI Key Agreement Protocol 



- each user $U$ has secret $a_U$, public $b_U = \alpha^{a_U} \bmod p$, and public

$$C(U) = (\mathrm{ID}(U), b_U, \mathrm{sig}_{TA}(\mathrm{ID}(U), b_U))$$

1. $U$ chooses a random $r_U \le p-2$

2. $U$ computes $s_U = \alpha^{r_U} \bmod p$ and sends $(C(U), s_U)$ to $V$

3. $V$ chooses a random $r_V \le p-2$

4. $V$ computes $s_V = \alpha^{r_V} \bmod p$ and sends $(C(V), s_V)$ to $U$

5. $U$ computes $K_{U,V} = s_V^{a_U} b_V^{r_U} \bmod p = \alpha^{r_U a_V + r_V a_U} \bmod p$

6. $V$ computes $K_{V,U} = s_U^{a_V} b_U^{r_V} \bmod p = \alpha^{r_U a_V + r_V a_U} \bmod p$

- the information is transmitted as follows (two-pass protocol): 

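$$U \xrightarrow{\;C(U),\ \alpha^{r_U}\;} V$$

$$U \xleftarrow{\;C(V),\ \alpha^{r_V}\;} V$$

- attempt of intruder-in-the-middle attack:

$$U \xrightarrow{\;C(U),\ \alpha^{r_U}\;} \text{Oscar} \xrightarrow{\;C(U),\ \alpha^{r'_U}\;} V$$

$$U \xleftarrow{\;C(V),\ \alpha^{r'_V}\;} \text{Oscar} \xleftarrow{\;C(V),\ \alpha^{r_V}\;} V$$

- $U$ and $V$ will compute different keys

- $U$ computes $K_1 = \alpha^{r_U a_V + r'_V a_U} \bmod p$

- $V$ computes $K_2 = \alpha^{r'_U a_V + r_V a_U} \bmod p$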

- neither of these can be computed by Oscar 
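The key computation in steps 5 and 6 of the MTI protocol can be checked numerically. The sketch below again assumes the toy parameters $p = 467$, $\alpha = 2$ and omits the certificates entirely, so it only illustrates that both sides arrive at $\alpha^{r_U a_V + r_V a_U} \bmod p$.

```python
import secrets

p, alpha = 467, 2   # toy public parameters (assumed for illustration)

def rand_exp():
    """Random exponent in the range 1 .. p-2."""
    return secrets.randbelow(p - 2) + 1

# long-term secrets a_U, a_V and certified public keys b_U, b_V
a_u, a_v = rand_exp(), rand_exp()
b_u, b_v = pow(alpha, a_u, p), pow(alpha, a_v, p)

# per-session randomness r_U, r_V and transmitted values s_U, s_V
r_u, r_v = rand_exp(), rand_exp()
s_u, s_v = pow(alpha, r_u, p), pow(alpha, r_v, p)

k_uv = pow(s_v, a_u, p) * pow(b_v, r_u, p) % p   # step 5, computed by U
k_vu = pow(s_u, a_v, p) * pow(b_u, r_v, p) % p   # step 6, computed by V
expected = pow(alpha, r_u * a_v + r_v * a_u, p)

print(k_uv == k_vu == expected)
```

Oscar, who sees only $s_U$, $s_V$ and the public $b_U$, $b_V$, would need one of the secret exponents to evaluate either product.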

15.8 Self-certifying keys 

- idea: without certificates 

- the public key and the identity of the owner authenticate each other 

Girault Key Agreement Protocol 

- given: $p, q, p_1, q_1$ secret primes (known to TA), $p = 2p_1 + 1$, $q = 2q_1 + 1$

- public $n = pq$

- public $\alpha \in \mathbb{Z}_n^*$, $\mathrm{ord}(\alpha) = 2 p_1 q_1$ (users need $\alpha$ to compute $b_U$ and $s_U$ below)

- each $U$ has $\mathrm{ID}(U)$

1. TA chooses a public RSA encryption exponent $e$

2. TA computes the secret decryption exponent $d = e^{-1} \bmod \varphi(n)$

3. (each) $U$ chooses a secret $a_U$ and sends $a_U$ and $b_U = \alpha^{a_U} \bmod n$ to TA

4. TA computes $p_U = (b_U - \mathrm{ID}(U))^d \bmod n$ and sends it to $U$

($p_U$ is called $U$'s self-certifying public key)


5. $U$ chooses a random $r_U$ and computes $s_U = \alpha^{r_U} \bmod n$

6. $U$ sends $(\mathrm{ID}(U), p_U, s_U)$ to $V$

7. $V$ chooses a random $r_V$ and computes $s_V = \alpha^{r_V} \bmod n$

8. $V$ sends $(\mathrm{ID}(V), p_V, s_V)$ to $U$

9. $U$ computes $K_{U,V} = s_V^{a_U} (p_V^e + \mathrm{ID}(V))^{r_U} \bmod n = \alpha^{r_U a_V + r_V a_U} \bmod n$

10. $V$ computes $K_{V,U} = s_U^{a_V} (p_U^e + \mathrm{ID}(U))^{r_V} \bmod n = \alpha^{r_U a_V + r_V a_U} \bmod n$

- notes 

- $U$ needs TA to produce $p_U$

- $b_U = p_U^e + \mathrm{ID}(U) \bmod n$ – can be computed from $p_U$ and $\mathrm{ID}(U)$ using only public information

- comments

- if Oscar produces some (faked) $b'_U$ without the cooperation of TA, then he cannot compute the keys

- if Oscar tries intruder-in-the-middle 

- the information transmitted is: 

$$U \xrightarrow{\;\mathrm{ID}(U),\ p_U,\ s_U = \alpha^{r_U} \bmod n\;} V$$

$$U \xleftarrow{\;\mathrm{ID}(V),\ p_V,\ s_V = \alpha^{r_V} \bmod n\;} V$$

- attempt of intruder-in-the-middle

$$U \xrightarrow{\;\mathrm{ID}(U),\ p_U,\ \alpha^{r_U} \bmod n\;} \text{Oscar} \xrightarrow{\;\mathrm{ID}(U),\ p'_U,\ \alpha^{r'_U} \bmod n\;} V$$

$$U \xleftarrow{\;\mathrm{ID}(V),\ p'_V,\ \alpha^{r'_V} \bmod n\;} \text{Oscar} \xleftarrow{\;\mathrm{ID}(V),\ p_V,\ \alpha^{r_V} \bmod n\;} V$$

- Oscar cannot choose $b'_V$ first, because he then cannot compute $p'_V = (b'_V - \mathrm{ID}(V))^d \bmod n$

- so Oscar chooses $r'_V$ and $p'_V$; Oscar can compute $b'_V$, which will correspond to some $a'_V$, i.e., $b'_V = \alpha^{a'_V} \bmod n$, but Oscar cannot compute $a'_V$

- $U$ computes $K_1 = \alpha^{r_U a'_V + r'_V a_U} \bmod n$

- $V$ computes $K_2 = \alpha^{r'_U a_V + r_V a'_U} \bmod n$

- Oscar cannot compute either one 

- one possible attack – if TA does not ask for both $a_U$ and $b_U$

- $U$ is required to give TA both $a_U$ and $b_U$

- TA does not need $a_U$; $p_U$ can be computed without it

- if users are not required to send both, attacks are possible

- Oscar chooses a fake $a'_U$

- Oscar computes $b'_U = \alpha^{a'_U} \bmod n$

(Oscar needs $p'_U = (b'_U - \mathrm{ID}(U))^d \bmod n$)

- Oscar computes $b'_{\text{Oscar}} = b'_U - \mathrm{ID}(U) + \mathrm{ID}(\text{Oscar})$

- Oscar sends $\mathrm{ID}(\text{Oscar})$ and $b'_{\text{Oscar}}$ to TA

- TA issues the public key $p'_{\text{Oscar}} = (b'_{\text{Oscar}} - \mathrm{ID}(\text{Oscar}))^d \bmod n$

- now $p'_{\text{Oscar}} = p'_U$ – so Oscar has obtained it

- Oscar, as the intruder-in-the-middle, can now compute the common key with $V$ because he knows $a'_U$

- so, Oscar can decrypt messages sent by $V$ to $U$
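A toy run of the Girault protocol shows how the self-certifying key $p_U$ lets anyone recover $b_U$ from public data. The parameters below are assumptions chosen only so the arithmetic is easy to follow ($p = 11$, $q = 23$, $e = 3$, and small integer identities); `pow(e, -1, phi)` needs Python 3.8 or later.

```python
import secrets
from math import gcd

p, q = 11, 23                       # toy safe primes: 11 = 2*5 + 1, 23 = 2*11 + 1
n, phi = p * q, (p - 1) * (q - 1)
target_order = 2 * 5 * 11           # 2 * p1 * q1

def order(g):
    """Multiplicative order of g modulo n (g must be coprime to n)."""
    k, x = 1, g % n
    while x != 1:
        x, k = x * g % n, k + 1
    return k

alpha = next(g for g in range(2, n) if gcd(g, n) == 1 and order(g) == target_order)
e = 3
d = pow(e, -1, phi)                 # TA's secret RSA exponent

def ta_issue(identity, b):
    """TA, step 4: p_U = (b_U - ID(U))^d mod n."""
    return pow((b - identity) % n, d, n)

def recover_b(identity, p_cert):
    """Anyone: b_U = p_U^e + ID(U) mod n."""
    return (pow(p_cert, e, n) + identity) % n

ids = {"U": 17, "V": 42}            # hypothetical numeric identities
a = {u: secrets.randbelow(target_order - 1) + 1 for u in ids}   # secrets a_U
b = {u: pow(alpha, a[u], n) for u in ids}
cert = {u: ta_issue(ids[u], b[u]) for u in ids}                 # p_U

r = {u: secrets.randbelow(target_order - 1) + 1 for u in ids}   # session r_U
s = {u: pow(alpha, r[u], n) for u in ids}

k_uv = pow(s["V"], a["U"], n) * pow(recover_b(ids["V"], cert["V"]), r["U"], n) % n
k_vu = pow(s["U"], a["V"], n) * pow(recover_b(ids["U"], cert["U"]), r["V"], n) % n
print(k_uv == k_vu)
```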


16 CRYPTOGRAPHIC PROTOCOLS 

A cryptographic protocol constitutes an algorithm for communication between different parties, adversaries or 

not. The goal achieved is usually beyond the simple secrecy of message transmission. For instance, one party 

can sign a message without seeing it, a secret can be divided among several parties in such a way that the 

secret can be reconstructed only when joining the information of all parties (or a certain number of those), one 

party can convince another that he/she is in possession of some information without disclosing anything of the 

information itself. Protocols realizing such goals have changed our ideas about what is impossible when several 

parties, adversaries or not, are communicating with each other. 

16.1 Blind signatures 

- idea: Alice wants Bob to sign a message $x$ without seeing it (Bob trusts Alice)

- normally, Bob would compute his signature on $x$ as $x^d \bmod n$, but now he cannot do it this way as he would see $x$

Blind signature

- given: RSA setup

1. Alice chooses a random secret $k$, $1 < k < n$ (with $\gcd(k, n) = 1$, so that $k^{-1} \bmod n$ exists)

2. Alice “blinds” $x$ by computing $t = x k^e \bmod n$ ($t$ looks random to Bob)

3. Bob signs $t$: $t^d \equiv (x k^e)^d \equiv x^d k^{ed} \equiv x^d k \pmod n$

4. Alice “unblinds” the signed $x$: $s = t^d k^{-1} \equiv x^d \pmod n$

- analogy: Alice seals the message inside an envelope with a piece of carbon paper. Bob signs the outside of the envelope; the signature also goes onto the message. Alice then opens the envelope and has Bob's signature on the message.
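The four steps can be checked with textbook RSA on small numbers. The primes, the exponent $e = 65537$, and the message value below are assumptions made for illustration; there is no hashing or padding, so this is a sketch of the algebra only.

```python
import secrets
from math import gcd

p, q = 1009, 1013                  # toy primes (Bob's secret)
n, phi = p * q, (p - 1) * (q - 1)
e = 65537                          # Bob's public exponent
d = pow(e, -1, phi)                # Bob's secret exponent (Python 3.8+)

x = 123456                         # Alice's message, 0 <= x < n

# step 1: Alice picks a blinding factor k that is invertible modulo n
while True:
    k = secrets.randbelow(n - 2) + 2
    if gcd(k, n) == 1:
        break

t = x * pow(k, e, n) % n           # step 2: Alice blinds x
signed_t = pow(t, d, n)            # step 3: Bob signs t without learning x
s = signed_t * pow(k, -1, n) % n   # step 4: Alice unblinds

print(s == pow(x, d, n), pow(s, e, n) == x)
```

The first check confirms that Alice ends up with exactly the signature Bob would have produced on $x$; the second is the usual RSA verification.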

16.2 Secret sharing 

- idea: a secret key $K$ is to be shared among $w$ parties such that any $t$ parties can discover $K$ but any $t-1$ cannot

- example: the control of nuclear weapons in Russia; any two parties among the President, Defence Minister, and Defence Ministry can control those, but one alone cannot

- example: $K$ opens a secret safe in a bank; any four tellers can open, one manager and two tellers can open, any two managers can open, and the president can open but nothing less can.

We define a $(t,w)$-threshold scheme as a method of sharing $K$ among $w$ parties such that any $t$ can compute $K$, and any $t-1$ cannot. (A $(4,w)$-threshold scheme would solve the above safe problem.) We assume the parties are $P_i$, $1 \le i \le w$, and that there is a trusted dealer $D$ which gives each party its share.

We give first a simple solution for the case t = w; this is called secret splitting. 

Secret splitting – (t,t)-threshold scheme 

- given: the secret key $K$; we assume $K$ is a binary string of length $l$

1. $D$ chooses $t-1$ random binary strings $s_i$, $1 \le i \le t-1$, each of length $l$

2. $D$ gives $s_i$ to $P_i$, $1 \le i \le t-1$

3. $D$ gives $P_t$ the string $s_t = \left(\bigoplus_{i=1}^{t-1} s_i\right) \oplus K$

- correctness:

- all parties can join and xor their shares: $\bigoplus_{i=1}^{t} s_i = \bigoplus_{i=1}^{t-1} s_i \oplus \bigoplus_{i=1}^{t-1} s_i \oplus K = K$

- if $t-1$ parties join their shares, then any $l$-bit string can be the value of the key
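Secret splitting is short enough to write out directly. In this sketch the key is treated as a byte string rather than a bit string, and the key value and the number of parties are hypothetical.

```python
import secrets
from functools import reduce

def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bytewise xor of two equal-length strings."""
    return bytes(x ^ y for x, y in zip(a, b))

def split(key: bytes, t: int):
    """Dealer D: t-1 random strings plus one string that xors back to the key."""
    shares = [secrets.token_bytes(len(key)) for _ in range(t - 1)]
    last = reduce(xor_bytes, shares, key)    # s_t = s_1 xor ... xor s_{t-1} xor K
    return shares + [last]

def combine(shares):
    """All t parties xor their shares together."""
    return reduce(xor_bytes, shares)

key = b"sixteen byte key"                    # hypothetical secret K
shares = split(key, 5)
print(combine(shares) == key)
```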

Next we see a fully general scheme, due to Shamir.


Shamir’s (t,w)-threshold scheme 

- given: the secret key $K$ as an integer number

1. $D$ chooses a prime number $p \ge w+1$

2. $D$ chooses $w$ different numbers $x_i \in \mathbb{Z}_p^*$, $1 \le i \le w$; these are public

3. $D$ chooses random secret numbers $a_i \in \mathbb{Z}_p$, $1 \le i \le t-1$, and forms the polynomial

$$a(x) = \sum_{j=0}^{t-1} a_j x^j \bmod p,$$

where $a_0 = K$

4. $D$ computes $y_i = a(x_i)$, $1 \le i \le w$

5. $P_i$ receives $y_i$

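Let us see that the above scheme works as intended. We show first that any $t$ parties can find $K$. We consider, without loss of generality, the first $t$ parties. Their shares allow them to solve the system

$$\begin{pmatrix} 1 & x_1 & x_1^2 & \cdots & x_1^{t-1} \\ 1 & x_2 & x_2^2 & \cdots & x_2^{t-1} \\ \vdots & \vdots & \vdots & & \vdots \\ 1 & x_t & x_t^2 & \cdots & x_t^{t-1} \end{pmatrix} \begin{pmatrix} a_0 \\ a_1 \\ \vdots \\ a_{t-1} \end{pmatrix} = \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_t \end{pmatrix}$$

The determinant of the system is (because the system has a Vandermonde matrix)

$$\prod_{1 \le i < j \le t} (x_j - x_i) \bmod p,$$

which is nonzero because the $x_i$ are pairwise distinct modulo $p$. Hence the system has a unique solution and, in particular, $a_0 = K$ is determined.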
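Shamir's scheme can be sketched as follows. Instead of solving the Vandermonde system directly, the recovery function below uses Lagrange interpolation at $x = 0$, which yields the same $a_0 = K$; that substitution, the prime `P`, the choice $x_i = i$, and the secret value are all assumptions made for the example.

```python
import secrets

P = 2**61 - 1   # prime p, assumed larger than both w and the secret K

def make_shares(secret, t, w):
    """Dealer D: random polynomial of degree t-1 with a_0 = secret; shares (x_i, y_i)."""
    coeffs = [secret] + [secrets.randbelow(P) for _ in range(t - 1)]
    def a(x):
        acc = 0
        for c in reversed(coeffs):          # Horner evaluation modulo p
            acc = (acc * x + c) % P
        return acc
    return [(x, a(x)) for x in range(1, w + 1)]   # public x_i = 1, ..., w

def recover(points):
    """Any t parties: Lagrange interpolation of a(x) at x = 0."""
    secret = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (-xj) % P
                den = den * (xi - xj) % P
        secret = (secret + yi * num * pow(den, -1, P)) % P
    return secret

shares = make_shares(1234, t=3, w=5)
print(recover(shares[:3]) == 1234, recover(shares[2:]) == 1234)
```

The inverses `pow(den, -1, P)` exist for the same reason the determinant above is nonzero: the $x_i$ are distinct modulo $p$.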


16.3 Zero-knowledge proofs 

In this section we focus on the following challenging and fascinating problem. Assume that P (the Prover) knows some information, which could be the proof of a long-standing open problem, the prime factorization of an integer, a 3-coloring of a graph, or simply a password or an identification number. P would like to convince V (the Verifier) that he is in possession of this information without revealing a single bit of it. Moreover, it is not enough that V learns little about the information; we want V to learn nothing whatsoever, that is, V must be able to simulate the protocol without P.

A simple protocol is the following. 

Zero-knowledge proof of factorization 

- given: an RSA integer $n$; P wants to prove to V that he knows the factorization of $n$

1. V chooses a random integer $x$ and tells $x^4 \bmod n$ to P

2. P tells $x^2 \bmod n$ to V

V obtains no information because she can square $x$ herself. On the other hand, extracting square roots modulo $n$ is equivalent to factoring $n$. In step 2, P not only has to extract a square root of $x^4$ but the particular one among the four square roots which is a quadratic residue modulo $n$. Determining quadratic residuosity is also intractable without knowledge of the factors of $n$.
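The prover's side of this protocol is easy to compute when both prime factors are congruent to 3 modulo 4; that restriction is an extra assumption of this sketch (the notes only say "RSA integer"), as are the toy primes. Under it, $z^{(p+1)/4} \bmod p$ is the square root of a quadratic residue $z$ that is itself a quadratic residue, and the Chinese remainder theorem combines the two roots.

```python
import secrets
from math import gcd

p, q = 1019, 1031          # P's secret factors; toy primes, both ≡ 3 (mod 4)
n = p * q

def qr_sqrt(z):
    """P: the square root of z modulo n that is itself a quadratic residue."""
    rp = pow(z, (p + 1) // 4, p)
    rq = pow(z, (q + 1) // 4, q)
    # Chinese remainder theorem: combine the roots modulo p and modulo q
    return (rp * q * pow(q, -1, p) + rq * p * pow(p, -1, q)) % n

# step 1: V picks x coprime to n and sends the challenge x^4 mod n
while True:
    x = secrets.randbelow(n - 2) + 2
    if gcd(x, n) == 1:
        break
challenge = pow(x, 4, n)

# step 2: P answers using the factorization; V compares with x^2 mod n
answer = qr_sqrt(challenge)
print(answer == pow(x, 2, n))
```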

Next we give a zero-knowledge proof of identity. A common problem with most identification techniques 

such as ID cards, credit cards, and computer passwords is that P proves his identity by revealing a word i(P) 

that is memorized or printed on a card. An adversary cooperating with a dishonest verifier can learn i(P) and 

thus can later use it to pretend to be P. 

An obvious solution to this problem is to use a zero-knowledge proof to convince V that P knows i(P) 

without revealing a single bit about it. 

In the protocol below, the existence of a trusted agency is assumed. The only purpose of the agency is to 

publish a modulus n which equals the product of two large primes p and q but to keep the two primes secret. 

After publishing, the agency may cease to exist. 

Zero-knowledge proof of identity 

- given: a modulus $n = pq$, with $p, q$ large secret primes, $p \equiv 3 \pmod 4$, $q \equiv 3 \pmod 4$

- P's secret identification $i(P)$ consists of $k$ numbers $c_1, c_2, \ldots, c_k$, $1 \le c_j < n$

- P's public identification $pi(P)$ consists of $k$ numbers $d_1, d_2, \ldots, d_k$, $1 \le d_j < n$, such that each $d_j$ satisfies one of the congruences

$$d_j c_j^2 \equiv \pm 1 \pmod n$$

1. P chooses a random number $r$, computes $\pm r^2 \bmod n$ and sends one of them, call it $x$, to V

2. V chooses a subset $S \subseteq \{1, 2, \ldots, k\}$ and tells it to P

3. P tells V the number

$$y = r \prod_{j \in S} c_j \pmod n$$

4. V verifies the condition

$$x \equiv \pm y^2 \prod_{j \in S} d_j \pmod n$$

Observe that the verification in step 4 should hold because

$$y^2 \prod_{j \in S} d_j \equiv r^2 \Big(\prod_{j \in S} c_j\Big)^2 \prod_{j \in S} d_j \equiv \pm r^2 \equiv \pm x \pmod n.$$

The use of $r$ is necessary because, otherwise, V would find out any $c_j$ by choosing $S = \{j\}$. The special form of $p$ and $q$ guarantees that the numbers $d_j$ can range over all integers with the Jacobi symbol $+1$ (mod $n$). This implies that V can be sure that the numbers $c_j$ exist. A tacit assumption is that any $c_j$ is relatively prime to $n$, otherwise $n$ can be factorized and the whole world collapses.

The only way for P to cheat is to guess $S$ in advance; the probability to do that is $2^{-k}$ and becomes $2^{-kt}$ when the protocol is repeated $t$ times.
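One round of the identification protocol can be simulated as below. The toy primes, the value $k = 8$, and the helper names are assumptions for illustration, and indices run from 0 rather than 1.

```python
import secrets
from math import gcd

p, q = 1019, 1031            # toy primes, both ≡ 3 (mod 4); secret in a real run
n = p * q
k = 8                        # number of secret values c_j

def rand_unit():
    """Random number in 1 .. n-1 that is relatively prime to n."""
    while True:
        v = secrets.randbelow(n - 1) + 1
        if gcd(v, n) == 1:
            return v

def rand_sign():
    return secrets.choice((1, -1))

# identification data: secret c_j, public d_j with d_j * c_j^2 ≡ ±1 (mod n)
c = [rand_unit() for _ in range(k)]
d = [rand_sign() * pow(cj * cj, -1, n) % n for cj in c]

def prover_respond(r, subset):
    """Step 3: y = r * product of c_j over j in S (mod n)."""
    y = r
    for j in subset:
        y = y * c[j] % n
    return y

def verifier_check(x, y, subset):
    """Step 4: x ≡ ± y^2 * product of d_j over j in S (mod n)."""
    v = y * y % n
    for j in subset:
        v = v * d[j] % n
    return x == v or x == (-v) % n

r = rand_unit()
x = rand_sign() * r * r % n                        # step 1: P commits to ±r^2
S = [j for j in range(k) if secrets.randbits(1)]   # step 2: V's random subset
y = prover_respond(r, S)
print(verifier_check(x, y, S))
```

Repeating the round with fresh $r$ and $S$ is what drives the cheating probability down to $2^{-kt}$.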


Contents

1 INTRODUCTION
1.1 Why do we need cryptography
1.2 Goals of cryptography
1.3 Definitions and notations
1.4 Security
1.5 Symmetric-key encryption
1.6 Public-key encryption

2 SEVERAL CLASSICAL SYSTEMS
2.1 Modular arithmetic
2.2 The shift cipher
2.3 The substitution cipher
2.4 The affine cipher
2.5 The Vigenère cipher
2.6 The Hill cipher
2.7 The permutation cipher
2.8 Stream ciphers
2.9 One-time pad

3 PERFECT SECRECY
3.1 Probability theory
3.2 Perfect secrecy

4 DATA ENCRYPTION STANDARD
4.1 History
4.2 Feistel ciphers
4.3 Description of DES
4.4 Analysis of DES
4.5 Modes of operation
4.6 Triple DES

5 LINEAR AND DIFFERENTIAL CRYPTANALYSIS
5.1 Iterated ciphers
5.2 Substitution-permutation network
5.3 Linear cryptanalysis
5.3.1 The piling-up lemma
5.4 Linear approximation of S-boxes
5.5 A linear attack on SPN
5.6 Complexity of attack
5.7 Differential cryptanalysis
5.8 Applications to DES

6 FINITE FIELDS
6.1 Definitions
6.2 Modular arithmetic
6.3 Polynomial rings
6.4 The ring $\mathbb{Z}_p[x]$
6.5 Finite fields
6.6 Motivation for using finite fields
6.7 Computational considerations in $\mathbb{F}_{2^n}$

7 ADVANCED ENCRYPTION STANDARD
7.1 The new standard
7.2 Description of AES
7.3 Decryption

8 MORE NUMBER THEORY
8.1 Complexity of arithmetic operations
8.2 The Chinese remainder theorem
8.3 The theorems of Fermat and Euler
8.4 Cyclic groups and primitive elements
8.5 Discrete logarithms

9 PUBLIC-KEY CRYPTOGRAPHY AND RSA
9.1 The idea of public keys
9.2 The RSA cryptosystem
9.3 RSA security
9.4 Implementation
9.5 Fast modular exponentiation
9.6 Complexity
9.7 Randomized algorithms
9.8 Primality tests
9.9 Attacks on RSA
9.9.1 Decryption exponent
9.9.2 Wiener’s low decryption exponent attack
9.9.3 Partial information about plaintext bits

10 FACTORING ALGORITHMS
10.1 Trial division
10.2 Pollard’s p−1 algorithm
10.3 Pollard’s rho algorithm
10.4 Random square factoring
10.5 Quadratic sieve algorithm
10.6 The best current factoring algorithms
10.7 Factoring RSA moduli

11 OTHER PUBLIC-KEY CRYPTOSYSTEMS
11.1 Rabin cryptosystem
11.2 Security of Rabin cryptosystem
11.3 ElGamal cryptosystem

12 ALGORITHMS FOR DISCRETE LOGARITHM
12.1 Shank’s baby-step giant-step algorithm
12.2 Pohlig-Hellman algorithm

13 HASH FUNCTIONS AND MESSAGE AUTHENTICATION
13.1 Data integrity and hash functions
13.2 Properties of hash functions
13.3 Security of hash functions
13.4 Iterated hash functions
13.5 MD5
13.6 SHA-1
13.7 RIPEMD-160
13.8 Message authentication codes
13.9 CBC-MAC
13.10 HMAC
13.11 Basic uses of encryption, hash functions, and MACs

14 DIGITAL SIGNATURES AND AUTHENTICATION
14.1 Digital versus conventional signatures
14.2 What is a signature scheme
14.3 RSA signature scheme
14.4 ElGamal signature scheme
14.5 Schnorr signature scheme
14.6 Digital Signature Algorithm (DSA)

15 KEY DISTRIBUTION AND KEY AGREEMENT
15.1 Key distribution
15.2 Blom key distribution scheme
15.3 Diffie-Hellman key distribution scheme
15.4 Kerberos
15.5 Diffie-Hellman key exchange scheme
15.6 Station-to-station protocol
15.7 MTI key agreement protocol
15.8 Self-certifying keys

16 CRYPTOGRAPHIC PROTOCOLS
16.1 Blind signatures
16.2 Secret sharing
16.3 Zero-knowledge proofs
