21.03.2013 Views

A Tight Bound for EMAC - CWI

A Tight Bound for EMAC - CWI

A Tight Bound for EMAC - CWI

SHOW MORE
SHOW LESS

Create successful ePaper yourself

Turn your PDF publications into a flip-book with our unique Google optimized e-Paper software.

A <strong>Tight</strong> <strong>Bound</strong> <strong>for</strong> <strong>EMAC</strong><br />

Krzysztof Pietrzak ⋆<br />

Département d’In<strong>for</strong>matique, École Normale Supérieure, Paris<br />

pietrzak@di.ens.fr<br />

Abstract. We prove a new upper bound on the advantage of any adversary<br />

<strong>for</strong> distinguishing the encrypted CBC-MAC (<strong>EMAC</strong>) based on<br />

random permutations from a random function. Our proof uses techniques<br />

recently introduced in [BPR05], which again were inspired by [DGH + 04].<br />

The bound we prove is tight — in the sense that it matches the advantage<br />

of known attacks up to a constant factor — <strong>for</strong> a wide range of the<br />

parameters: let n denote the block-size, q the number of queries the<br />

adversary is allowed to make and ℓ an upper bound on the length (i.e.<br />

number of blocks) of the messages, then <strong>for</strong> ℓ ≤ 2 n/8 and q ≥ ℓ 2 the<br />

advantage is in the order of q 2 /2 n (and in particular independent of ℓ).<br />

This improves on the previous bound of q 2 ℓ Θ(1/ ln ln ℓ) /2 n from [BPR05]<br />

and matches the trivial attack (which thus is basically optimal) where<br />

one simply asks random queries until a collision is found.<br />

1 Introduction<br />

Cipher Block Chaining (CBC) is a popular mode of operation <strong>for</strong> block ciphers<br />

which is used (in some variations) <strong>for</strong> encryption and message authentication,<br />

i.e. as a Message Authentication Code (MAC).<br />

Some Definitions. The CBC function with key π : {0, 1} n →{0, 1} n , denoted<br />

CBCπ, takes as input a message (whose length must be a multiple of n) M =<br />

M1 ···Mm ∈ ({0, 1} n ) m and outputs Cm which is inductively computed as<br />

CBCπ(M) =Cm where C0 =0 n and Ci = π(Ci−1 ⊕ Mi) <strong>for</strong>i =1,...,m<br />

The ECBC function (E <strong>for</strong> encrypted) is derived from the CBC function by<br />

additionally encrypting the output with an independent permutation 1<br />

ECBCπ1,π2(M) def<br />

= π2(CBCπ1(M))<br />

CBC based MACs. The CBC and ECBC function, with the π’s instantiated<br />

by a block-cipher, are popular MACs called CBC-MAC and <strong>EMAC</strong> respectively.<br />

As <strong>for</strong> the CBC-MAC, two parties sharing a secret key K ∈K<strong>for</strong> a blockcipher<br />

E : K×{0, 1} n → {0, 1} n can authenticate their communication by<br />

⋆ Part of this work is supported by the Commission of the European Communities<br />

through the IST program under contract IST-2002-507932 ECRYPT.<br />

1 The ECBC function must not be confused with the ECBC-MAC from [BR00].


sending, together with their message M, the authentication tag CBCE(K,.)(M). On the Security of CBC based MACs. The CBC-MAC as just described<br />

is well known to be completely insecure in general, 2 but has been proven secure<br />

(under the assumption that the underlying block-cipher is a secure pseudorandom<br />

permutation) under the restriction that all messages have the same length<br />

in [BKR00], which then has been relaxed to the condition that no message is the<br />

prefix of another [PR00]. This means that the CBC-MAC can be safely used <strong>for</strong><br />

messages of different length, if some prefix free encoding is applied.<br />

The <strong>EMAC</strong> is a popular variant of the CBC-MAC which was developed by the<br />

RACE project [BP95], unlike the “plain” CBC-MAC it is secure without any<br />

restriction on the message space [PR00]. The <strong>EMAC</strong>, along with the UMAC,<br />

TTMAC and HMAC, is one of the message authentication codes recommended<br />

by NESSIE [NES].<br />

The Model. As nowadays usual, we analyse the security of the construction we<br />

are interested in (which is ECBCπ1,π2) in a setting where the underlying primitive<br />

(here π1,π2) are realized by their ideal functionality (here uni<strong>for</strong>mly random<br />

permutations), thus separating the analysis of the security of the construction<br />

from the security of the underlying primitive. 3 More precisely, we prove an upper<br />

bound on AdvECBC(q, n, ℓ), by which we denote probability of any adversary<br />

making q queries of length at most ℓ blocks, in (existentially) <strong>for</strong>ging ECBCπ1,π2.<br />

Following [BR00,BPR05], we view the <strong>EMAC</strong> as a Carter-Wegman MAC<br />

[CW79]. This reduces the task of bounding AdvECBC(q, n, ℓ) to the task of<br />

bounding the probability that there is a collision amongst the CBC-MACs of<br />

q messages of length at most ℓ blocks, we denote this probability by CPq,n,ℓ<br />

(see (5)). In practice one would instantiate the πi’s by a block-cipher (and not<br />

with uni<strong>for</strong>m random permutations). If this block-cipher is secure in the sense<br />

of being a good pseudorandom permutation, then the security of the <strong>EMAC</strong> is<br />

basically CPq,n,ℓ, thus proving a good bound on this probability translates into<br />

improved security guarantees <strong>for</strong> the <strong>EMAC</strong>.<br />

Known Lower <strong>Bound</strong>s. There is a trivial lower bound CPq,n,ℓ ∈ Ω(q2 /2n )<br />

<strong>for</strong> any q, n and ℓ>1 as by the birthday bound we can find a collision with<br />

probability Ω(q2 /2n ) <strong>for</strong> any input shrinking function by asking random queries. 4<br />

For q = 2 [BPR05] show a lower bound of CP2,n,ℓ ∈ Ω(d(ℓ)/2n )where<br />

d(ℓ) def<br />

= maxt≤ℓ |{x;1≤ x ≤ 2n ,x|t}| denotes the maximum number of divisors<br />

between 1 and 2n of any number ≤ ℓ. It is known (Theorem 317 in [HW80])<br />

that D(ℓ) def<br />

= maxt≤ℓ |{x; x|t}| ∈ ℓΘ(1/ ln ln ℓ) , so the same bound applies <strong>for</strong> d(ℓ)<br />

if ℓ ≤ 2n as then d(ℓ) =D(ℓ).<br />

2 In particular, it is not existentially un<strong>for</strong>geable as shown by the following simple<br />

attack: <strong>for</strong> any X ∈{0, 1} n ,requesttheMACC = CBCπ(X) =π(X), and output<br />

a message XX ⊕ C with tag C. This is a successful <strong>for</strong>gery as CBCπ(XX ⊕ C) =<br />

π(π(X) ⊕ X ⊕ C) =π(X) =C.<br />

3 See e.g. [Mau02] <strong>for</strong> more detailed discussion of this concept.<br />

4 For ℓ =1wehaveCPq,n,ℓ = 0 as a permutation does not have collisions.


Upper <strong>Bound</strong>, Range Other<br />

Where O(.) of restriction restrictions<br />

[PR00] ℓ 2 q 2 /2 n<br />

- -<br />

[BPR05] d(ℓ)q 2 /2 n<br />

ℓ ∈ O(2 n/4 ) -<br />

[DGH + 04] q 2 /2 n<br />

ℓ ∈ O(2 n/3 ) Equal length messages<br />

Here q 2 /2 n<br />

ℓ ∈ O(2 n/8 ),q ∈ Ω(ℓ 2 ) -<br />

Lower <strong>Bound</strong>,<br />

Where Ω(.) of<br />

Folklore (birthday bound) q 2 /2 n<br />

[BPR05] d(ℓ)q/2 n<br />

Fig. 1. Upper and lower bounds <strong>for</strong> CPq,n,ℓ (which then imply basically the same<br />

bounds <strong>for</strong> AdvECBC(q, n, ℓ)).<br />

Known Upper <strong>Bound</strong>s. Until now the best known upper bound was a CPn,q,ℓ ∈<br />

O(d(ℓ)q2 /2n )(<strong>for</strong>ℓ≤2n/4 ) due to Bellare et al. [BPR05], this bound improved<br />

on the O(ℓ2q2 /2n ) bound of Petrank and Rackoff [PR00]).<br />

<strong>Tight</strong> <strong>Bound</strong> <strong>for</strong> Equal Length. Dodis et al. [DGH + 04] investigated a<br />

restricted case where the messages have same length (which is uninteresting <strong>for</strong><br />

the <strong>EMAC</strong> construction, but this was not their goal), they state a tight collision<br />

probability of CP2,n,ℓ ∈ O(q2 /2n )(<strong>for</strong>ℓ≤2n/3 ) <strong>for</strong> the CBC-MAC of two<br />

messages, which immediately gives an optimal CPq,n,ℓ ∈ Θ(q2 /2n ) bound <strong>for</strong><br />

the collision probability of q equal length messages.<br />

Our Contribution. In this paper we prove the optimal bound CPq,n,ℓ ∈<br />

Θ(q2 /2n )<strong>for</strong>q≥ℓ2 and ℓ ≤ 2n/8 . So <strong>for</strong> this range the security of ECBC (and<br />

thus the <strong>EMAC</strong>) matches the security of an ideal MAC (i.e. the birthday bound)<br />

up to constant factors.<br />

The technique from [BPR05]. Both, the “classical” O(q2ℓ2 /2n )[PR00]and<br />

the O(d(ℓ)q2 /2n ) upper bound [BPR05] are achieved by first proving an upper<br />

bound on CP2,n,ℓ, the collision probability of two messages, and then applying<br />

the union bound<br />

q(q − 1)<br />

CPq,n,ℓ ≤ · CP2,n,ℓ<br />

(1)<br />

2<br />

to get a bound <strong>for</strong> CPq,n,ℓ. In particular [BPR05] prove that<br />

CP2,n,ℓ ≤ 2d(ℓ)/2 n +64ℓ 4 /2 2n . (2)<br />

This bound is tight up to the higher order term and a factor 2:<br />

CP2,n,ℓ ≥ d(ℓ)/2 n<br />

The proof of (2) uses ideas from [DGH + 04,Dod05] and goes roughly as follows:<br />

For any two messages M1,M2 and a permutation π one maps the computation<br />

of CBCπ(M1) andCBCπ(M2) to a graph (called structure graph) consisting of<br />

two paths associated with the message M1 and M2 respectively. In this graph<br />

the vertices correspond to the outputs of π during this computation.<br />

(3)


Each such graph contains zero or more accidents, by which one denotes the<br />

“unexpected” collisions in the graph. The main technical lemma (Lemma 2 in<br />

this paper) now states that the probability (over the choice of π) thatsome<br />

particular structure graph G will appear is exponentially small in the number<br />

of accidents of G. From this lemma one gets that the probability that a random<br />

structure graph has at least one accident is in O(ℓ 2 /2 n ). We can now use that<br />

CBCπ(M1) =CBCπ(M2) implies that there must be at least one accident to get a<br />

O(ℓ 2 /2 n ) upper bound on CP2,n,ℓ, and further with (1) the “classical” CPq,n,ℓ ∈<br />

O(q 2 ℓ 2 /2 n ) bound. But this bound is not tight as having an accident is only<br />

necessary, but not sufficient to have CBCπ(M1) =CBCπ(M2). By more carefully<br />

upper bounding the number of graphs <strong>for</strong> which CBCπ(M1) =CBCπ(M2) by<br />

O(d(ℓ)/2 n +ℓ 4 /2 2n ) one gets the (2) bound. Here the O(d(ℓ)/2 n ) term bounds the<br />

graphs which have exactly one accident and CBCπ(M1) =CBCπ(M2), whereas<br />

all graphs with two or more accidents are “generously” bounded by the “higher<br />

order” term O(ℓ 4 /2 2n ), whech will be dominated by the leading d(ℓ)/2 n while ℓ<br />

is not too large, ℓ ∈ O(2 n/4 ) is small enough.<br />

Un<strong>for</strong>tunately the bound (3) implies that bounding the collision probability<br />

<strong>for</strong> two messages and then using (1) one cannot prove CPq,n,ℓ ∈ o(d(ℓ)q 2 /2 n ).<br />

Proof Idea. The obvious idea to overcome this barrier is to upper bound the<br />

number of structure graphs built by many (and not just two) messages. We prove<br />

a lemma (Lemma 4) which states that the number of structure graphs built from<br />

any k messages of length at most ℓ blocks, having exactly one accident and a<br />

collision on the output <strong>for</strong> some pair of messages, is at most k(k + ℓ 2 ), this then<br />

gives the claimed CPq,n,ℓ ∈ O(q 2 /ℓ 2 ) bound. Un<strong>for</strong>tunately now the graph is so<br />

big (i.e. qℓ vertices) that the higher order term which bounds the cases where we<br />

have two or more accidents is in the order q 4 ℓ 4 /2 2n (so unless we assume some<br />

bound o(2 n/2 )onq, we only achieve a tight O(q 2 /2 n )<strong>for</strong>constantℓ, but this is<br />

already achieved by the classical q 2 ℓ 2 /2 n of [PR00]).<br />

Fortunately one can get out of this apparent cul-de-sac using an approach<br />

“between” the one just described and the one given by (1). The q messages are<br />

divided into q/ℓ 2 sets of size r = ℓ 2 . Now, if there’s a collision, then this collision<br />

occurs in the union of two (or maybe just one) such sets. For such a union of two<br />

sets (of size 2r) we can now upper bound the probability that there’s a collision<br />

amongst any two of the 2r messages by O(r 2 /2 n ) as the sets are sufficiently large<br />

(such that applying the be<strong>for</strong>e-mentioned Lemma 4 gives a 2r(2r + ℓ 2 )=Θ(r 2 )<br />

upper bound on the number of structure graphs), but still small enough <strong>for</strong> the<br />

higher order term to be ignored <strong>for</strong> a reasonable range of ℓ. Finally we get our<br />

CPq,n,ℓ ∈ O(q 2 /2 n ) bound (<strong>for</strong> ℓ ≤ 2 n/8 ) from the union bound applied over all<br />

pairs of sets.<br />

About the Range. The tight upper bound CPq,n,ℓ ∈ O(q 2 /2 n )weproveholds<br />

<strong>for</strong> q ∈ Ω(ℓ 2 )andℓ ∈ O(2 n/8 ). In the next two paragraphs we’ll shortly discuss<br />

those two bounds.


Lower <strong>Bound</strong> on q. The CP2,n,ℓ = Θ(d(ℓ)/2 n ) bound implies (under a reasonable<br />

assumption 5 ) CPq,n,ℓ = Ω(d(ℓ)q/2 n ). Thus CPq,n,ℓ ∈ O(q 2 /2 n ) can only<br />

hold if we have a lower bound <strong>for</strong> q of at least Ω(d(ℓ)), the bound we actually<br />

require is q ∈ Ω(ℓ 2 ). 6 But this lower bound on q is not really relevant as long<br />

as there’s a upper bound ≪ 2 n/2 on ℓ, as it only means that we don’t match<br />

the birthday bound O(q 2 /2 n ) <strong>for</strong> a range of parameters, where the collision<br />

probability given by the classical q 2 ℓ 2 /2 n bound is extremely small anyway.<br />

Upper <strong>Bound</strong> on ℓ. Wlog. we can assume an upper bound ℓ ≤ 2 n !asconsidering<br />

longer messages makes no sense: note that every x, 1 ≤ x ≤ 2 n divides 2 n !<br />

and thus CP2,n,2 n ! ≥ d(2 n !)/2 n = 1, i.e. we can find a collision with probability<br />

one with only two queries. 7 This (doubly exponential) bound is far from the<br />

ℓ ≤ 2 n/8 we require, and can probably be relaxed already with the techniques<br />

used in this paper. One possibility would be via a better counting argument,<br />

which means improving on Lemma 4 from this paper (in particular, Claim 2<br />

from the proof of this lemma seems quite loose). Lowering the O(q(q + ℓ 2 ))<br />

bound on the number of graphs given by the lemma to q(q + o(ℓ)) would already<br />

allow a range of ℓ ≤ 2 n/(4+o(1)) . Further, counting graphs with more than just<br />

one (but still constantly many) accidents could have the potential to get the<br />

bound to ℓ ≤ 2 n/(2+ɛ) <strong>for</strong> any ɛ>0. Such a bound might still be far from the<br />

necessary one, but would be sufficient <strong>for</strong> any practical application as a length<br />

of 2 n/2 is quite big already <strong>for</strong> small block lengths (say n = 128 which is the<br />

smallest block-length provided by AES).<br />

2 Definitions and the Main Technical Lemma<br />

Notation If x is a string then |x| denotes its length. We let Bn<br />

def<br />

= {0, 1} n .<br />

If X ⊆{0, 1} ∗ then X ≤m denotes the set of all non-empty strings <strong>for</strong>med by<br />

concatenating m or fewer strings from X. IfS is a set equipped with some<br />

probability distribution then s $<br />

← S denotes the operation of picking s from S<br />

according to this distribution. If no distribution is explicitly specified, it is understood<br />

to be uni<strong>for</strong>m. We denote by Perm(n) the set of all permutations over<br />

{0, 1} n and with Func(n) the set of all functions {0, 1} ∗ →{0, 1} n .<br />

Security. An adversary is a computationally unbounded, randomised oraclealgorithm<br />

which finally outputs a bit. Aq,n,ℓ denotes the class of adversaries that<br />

make at most q oracle queries, each of length at most ℓn-bit blocks. For a family<br />

of functions F : B∗ n →{0, 1}n , the distinguishing advantage of Aq,n,ℓ <strong>for</strong> F is<br />

AdvF (q, n, ℓ) = max { AdvF (A) } where<br />

A∈Aq,n,ℓ 5 We must assume that one can generate q/2 pairs of messages where each pair achieves<br />

the “worst case” collision probability Ω(d(ℓ)/2 n ), and moreover the events that any<br />

pair of messages collides are sufficiently independent.<br />

6 As both, the lower d(ℓ) and the upper ℓ 2 bound follow by rather loose arguments,<br />

the truth is probably strictly in-between, i.e. in ω(d(ℓ)) and o(ℓ 2 ).<br />

7 In fact, with ℓ =2 n ! we can <strong>for</strong>ge a message in a no-query attack as <strong>for</strong> any X ∈ Bn<br />

and π ∈ Perm(n) one has CBCπ(X 2n ! )=X.


AdvF (A) =Pr[f $<br />

← F : A f ⇒ 1] − Pr[f $<br />

← Func(n): A f ⇒ 1]<br />

Cbc and Ecbc. Fix n ≥ 1. Recall that <strong>for</strong> M = M 1 ···M m ∈ B m n and π: Bn →<br />

Bn we defined in the introduction<br />

CBCπ(M) =Cm where C0 =0 n and Ci = π(Ci−1 ⊕ Mi) <strong>for</strong>i =1,...,m<br />

Let CBC = {CBCπ: π ∈ Perm(n)}, this set of functions has the distribution<br />

induced by picking π uni<strong>for</strong>mly from Perm(n). The encrypted CBC MAC is<br />

ECBCπ1,π2(M) def<br />

= π2(CBCπ1(M))<br />

Let ECBC = {ECBCπ1,π2: π1,π2 ∈ Perm(n)}, with the distribution induced by<br />

picking π1,π2 independently and uni<strong>for</strong>mly at random from Perm(n).<br />

Collisions. For q distinct messages M1,...,Mq ∈ B∗ n we denote by<br />

CPn(M1,...,Mq) Pr<br />

π $ ← Perm(n)<br />

[∃i, j, Mi = Mj : CBCπ(Mi) =CBCπ(Mj)]<br />

the probability that the CBC-MACs (based on a uni<strong>for</strong>m random permutation)<br />

of any two messages collide. The maximum collision probability <strong>for</strong> any q messages<br />

of length at most ℓn-bit blocks is denoted by<br />

CPq,n,ℓ = max<br />

M1,...,Mq∈B ≤ℓ<br />

n<br />

CPn(M1,...,Mq) (4)<br />

Following [BR00], we view ECBC as an instance of the Carter-Wegman paradigm<br />

[CW79]. This enables us to reduce the problem of bounding Adv ECBC (q, n, ℓ) to<br />

bounding the collision probability CPq,n,ℓ as<br />

Adv ECBC (q, n, ℓ) ≤ CPq,n,ℓ + q(q − 1)/2 n+1<br />

We prove the following bound on CPq,n,ℓ.<br />

Lemma 1. For any q ≥ ℓ 2 : CPq,n,ℓ ≤ 16 · q 2 /2 n + 128 · q 2 ℓ 8 /2 2n<br />

From this lemma and (5) we get that Adv ECBC (q, n, ℓ) ∈ O(q 2 /2 n ) whenever<br />

q ∈ Ω(ℓ 2 )andℓ ∈ O(2 n/8 ), <strong>for</strong> example<br />

Corollary 1 For any q ≥ ℓ 2 and ℓ ≤ 2 n/8−1 : Adv ECBC (q, n, ℓ) ≤ 18 · q 2 /2 n<br />

3 A Graph-Based Representation of CBC<br />

In this section we review the graph-based approach to bound collision probabilities<br />

from (the full version of) [BPR05]. In this approach the collision probability<br />

is related to the number of graphs satisfying some property.<br />

We fix <strong>for</strong> the rest of this section a blocklength n ≥ 1, the number of messages<br />

t ≥ 1andtdistinct messages M def<br />

= {M1,...,Mt}, where<strong>for</strong>1≤i≤twedenote def<br />

with mi ≥ 1 the length (in blocks) of the i’th message Mi = M 1 i<br />

···M mi<br />

i<br />

(5)<br />

∈ Bmi<br />

n .


C C<br />

C<br />

C<br />

C<br />

D<br />

C<br />

D<br />

C<br />

C<br />

C<br />

C<br />

C<br />

C<br />

D<br />

D<br />

C<br />

B<br />

B<br />

B<br />

D<br />

E<br />

B<br />

B<br />

B<br />

B<br />

E E<br />

E<br />

E C<br />

E<br />

E<br />

A<br />

A<br />

A<br />

A<br />

D<br />

D<br />

D<br />

D<br />

A<br />

A<br />

A D<br />

D<br />

D<br />

G1,3 G ′ 1,3<br />

G1,2 G2,3 G G ′ G ′′<br />

V 0 1 = V 0 2 = V 0 V<br />

3<br />

1 1 = V 1 V<br />

3<br />

2 1 = V 2 3 = V 4 V<br />

3<br />

3 3<br />

V 1 V<br />

2<br />

2 V<br />

2<br />

3 2<br />

B<br />

00 11<br />

00 11<br />

00 11<br />

000 111<br />

000 111<br />

000 111<br />

000 111<br />

A<br />

C<br />

00 11<br />

00 11<br />

00 11<br />

00 11<br />

00 11<br />

00 11<br />

00 11<br />

00 11<br />

00 11<br />

00 11<br />

00 11<br />

Fig. 2. Gcol(M) = {G1,3,G ′ 1,3,G1,2,G2,3} are all structure graphs <strong>for</strong> M =<br />

{AB,DEC,ABCC} which have exactly one accident and a collision on the outputs.<br />

Further {G, G ′ ,G ′′ }∈G(M) \Gcol(M) are valid structure graphs but not in Gcol(M)<br />

as: G has 0 and G ′′ has 2 accidents. G ′ has exactly one accident but no collision on the<br />

outputs. H is not a structure graph as there’s a vertex which has two ingoing edges,<br />

both labelled C but not being parallel.<br />

For 1 ≤ j ≤ t let mj = j i=1 mi be the length of the first j messages. It<br />

0 def<br />

is convenient to set m = 0 and m def<br />

= mt to be the total length. Let M def<br />

=<br />

M1M2···Mt denote the concatenation of all messages and M i the i’th block<br />

of M, i.e. M def<br />

= M 1 ···M m .<br />

Structure graphs. To M and any π ∈ Perm(n) weassociatethestructure<br />

graph GM π , which is a directed graph (V,E) whereV⊆ [0,...,m].<br />

The structure graph GM π = G =(V,E) is defined as follows: We set<br />

C0 =0n and <strong>for</strong> i =1,...,m we define<br />

<br />

π(Ci−1 ⊕ M<br />

Ci =<br />

i ) if i/∈ [m0 +1,...,mt−1 +1]<br />

π(M i ) otherwise<br />

From this Ci’s we define the mapping [.]G :[0,...,m] → [0,...,m]as<br />

[i]G = min{j : Cj = Ci}. It is convenient to define a mapping [.] ′ G<br />

as [i] ′ G =[i]G if i /∈ [m0 ,...,mt−1 ]and[i] ′ G = 0 otherwise. Now the<br />

structure graph GM π = G =(V,E) isgivenby<br />

V = {[i]G :0≤ i ≤ m} E = {([i − 1] ′ G , [i]G) :1≤ i ≤ m}<br />

From this definition it is clear that the mapping [.]G defines G uniquely and vice<br />

versa. Throughout the “i’th edge of G” refers to the edge ([i − 1] ′ G , [i]G) (note<br />

that this not injective) and the “label” of the i’th edge is M i .<br />

If the Ci’s are all distinct, then G is simply a star like tree with t paths<br />

leaving the root 0, the i’th path being 0 → mi−1 +1→ ...→ mi−1 + mi = mi .<br />

In general G is the graph one gets by starting with the tree just described and<br />

doing the following while possible: if there are two vertices i, j where i = j and<br />

Ci = Cj then collapse i and j into one vertex and label it min{i, j}.<br />

For a structure graph G we will denote the vertices on the path built by<br />

the i’th message by V 0<br />

i (G),V1 i (G),...,Vmi i (G), we call this path the i-path, we<br />

write V j<br />

i <strong>for</strong> V j<br />

i (G) ifGis understood (cf. Figure 2).<br />

Let G(M) ={Gπ : π ∈ Perm(n)} denote the set of all structure graphs<br />

associated to messages M. This set has the probability distribution induced by<br />

picking π at random from Perm(n).<br />

00 11<br />

00 11<br />

00 11<br />

H<br />

D<br />

C<br />

00 11<br />

00 11<br />

00 11<br />

D<br />

E


Collisions. Suppose a structure graph G = GM π ∈G(M) is exposed edge by<br />

edge (i.e. in step i the value [i]G is shown to us). We say that G has a collision<br />

in step i if the edge exposed in step i points to a vertex which is already in the<br />

graph. With Col(G) we denote all collisions, i.e. all pairs (i, j) whereinstepi<br />

there was a collision which hit the vertex computed in step j


e an accident. And if we only consider structure graphs built by at most two<br />

paths, then also the second true collision is necessarily an accident (see Lemma<br />

10 in the full version of [BPR05], fact (i) below follows from this).<br />

For G ∈G(M) andi, j, 1≤ i


Lemma 4. Let n, q ≥ 1 and 1 ≤ m1,...,mq ≤ ℓ and M = {M1,...,Mq} with<br />

Mi ∈ B mi<br />

n be distinct messages, then<br />

|G 1 col(M)| ≤q(q + ℓ + ℓ 2 )/2<br />

Now combining (6)-(8) and the above Lemma we get:<br />

Lemma 5. CPq,n,ℓ ≤ q(q+ℓ+ℓ2 )<br />

2(2n−ℓq) + 4q4ℓ 4<br />

22n This already gives CPq,n,ℓ ∈ O(q 2 /2 n )<strong>for</strong>q 2 ℓ 4 ∈ O(2 n )andq ∈ Ω(ℓ 2 ). But<br />

we can do better. The reason why this bound is not so great is that the term<br />

which bounds the “two or more” accident case is of rather large order q 4 ℓ 4 /2 n<br />

as we consider a graph (i.e. a total message length) of size qℓ. Weachievethe<br />

bound claimed by Lemma 1 by splitting the messages in chunks of size ℓ 2 (with<br />

<strong>for</strong>esight) and then applying the following lemma which is a generalisation of (1).<br />

Lemma 6. If r divides q then CPq,n,ℓ ≤ CP2r,n,ℓ · q(q−r)<br />

2·r 2<br />

Proof. Consider q messages M1,...,Mq where CPq,n,ℓ = CPn(M1,...,Mq).<br />

We split the q messages into q/r sets S1,...,Sq/r, each containing r messages.<br />

If two messages collide, then there are two sets containing this two messages, so<br />

using the union bound CPn(M1,...,Mq) ≤ <br />

i,j,1≤i 1 as otherwise the above is ≥ 1<br />

which is a trivial upper bound <strong>for</strong> CPq,n,ℓ. Wealsohaveq


We will prove two claims, which then will imply the statement of the lemma.<br />

The first claim — which is basically Lemma 19 from [BPR05] — states that if<br />

i ∈ Pj, then there’s at most one structure graph with exactly one accident where<br />

Mi and Mj collide.<br />

The second claim bounds the number of structure graphs having one accident<br />

and a collision between Mj and any other message Mi where i ∈Pj by ℓ(ℓ+1)/2<br />

(note that this bound only depends on the length, but not on the number of<br />

messages considered). To prove this claim we use the simple observation that if<br />

there’s a collision between Mj and any Mi where i ∈Pj, then it must be the case<br />

that the j-path makes a loop. So we can upper bound the number of structure<br />

graphs having such a collision by the number of structure graphs having where<br />

the j-path loops.<br />

P<br />

Claim 1 For each i ∈ Pj, |Gi,j| ≤1.<br />

S<br />

Fig. 3. Figure <strong>for</strong> proof of Claim 1<br />

Proof (of Claim). Let P denote the common prefix and S the common suffix of<br />

Mi and Mj. SoMi = P M ′ i S and Mj = P M ′ j S where the M ′ i and M ′ j are<br />

nonempty as i ∈ Pj. Letp = |P |/n, s = |S|/n.<br />

By definition G ∈Gi,j means V mi mj<br />

i = Vj , this implies that also V mi−s<br />

i =<br />

(as <strong>for</strong> the last s steps the i and j path must go in parallel). Now as<br />

M mi−s−1<br />

i = M mj−s−1<br />

j (otherwise we could extend the suffix) we have V mi−s−1<br />

i =<br />

(because in a structure graph two edges with distinct labels cannot be<br />

parallel). So there’s a true collision in G [i,j] which hits the vertex V mi−s<br />

i .<br />

V mj−s<br />

j<br />

V mj−s−1<br />

j<br />

As by fact (i) 9 there can be only one true collision in G [i,j] this means that<br />

the “suffix path” V mi−s<br />

i<br />

= V mj−s<br />

j<br />

same reason the “prefix path” V 1<br />

i<br />

→ ... → V mi<br />

i<br />

= V 1<br />

j<br />

mj<br />

= Vj has no loops. For the<br />

p p<br />

→ ... → V = V makes no loop and<br />

also the prefix and suffix paths must be disjoint. So the subgraph of G [i,j] built<br />

by the first p +1 and the last s +1edges of the i and j path looks like shown on<br />

the left in Figure 3. There’s only way to extend this subgraph to the full G [i,j]<br />

without introducing more true collisions, this is the second graph in Figure 3.<br />

So there’s only one possible G [i,j], and by fact (ii) it uniquely determines the<br />

whole structure graph, thus there’s just one G ∈Gi,j. △<br />

Claim 2<br />

<br />

<br />

<br />

<br />

≤ ℓ(ℓ +1)/2.<br />

i∈P Gi,j<br />

9 We refer to the facts stated at the end of Section 3.<br />

i<br />

j


Proof (of Claim). Consider any i ∈P,andletS denote the common suffix of Mi<br />

and Mj. Now,asi ∈P,<strong>for</strong>someP we can write Mj = P M ′ j S and Mi = P S.<br />

Let p = |P |/n and s = |S|/n.<br />

Consider any G ∈Gi,j, by definition V mi<br />

i<br />

V mj−s<br />

j<br />

mj<br />

= Vj , which implies V mi−s<br />

i =<br />

as the last s blocks are equal. And as the first p blocks are equal we<br />

have V p p<br />

p mi−s<br />

i = Vj .NowVi = Vi and thus also V p mj−s<br />

j = Vj ,asp

Hooray! Your file is uploaded and ready to be published.

Saved successfully!

Ooh no, something went wrong!