
Budapest University of Technology and Economics
Department of Telecommunications

Privacy enhancing protocols for wireless networks

Ph.D. Dissertation of Tamás Holczer

Supervisor: Levente Buttyán, Ph.D.

TO BE ON THE SAFE SIDE

Budapest, Hungary
2012


I, the undersigned Tamás Holczer, hereby declare that I prepared this Ph.D. dissertation myself and that I used only the sources given at the end. Every part that was quoted word for word, or taken over with the same content but rephrased, is explicitly marked with a reference to its source.

The reviews of the dissertation and the report of the thesis defense are available at the Dean's Office of the Faculty of Electrical Engineering and Informatics of the Budapest University of Technology and Economics.

Budapest, . . . . . . . . . . . . . . . . . . . . . . . .

Holczer Tamás


Abstract

Wireless networks are used in our everyday life. We use wireless networks to call each other, to download our emails at home, or to enter a building with a proximity card. In the near future wireless networks will be used in many new fields such as vehicular ad hoc networks, or critical infrastructure protection.

The use of wireless networks instead of wired networks opens up new research challenges. These challenges include mobility, coping with unreliable links, resource constraints, and the security and privacy aspects of the wireless networks. In this thesis some privacy aspects of different wireless networks are investigated.

In Chapter 2, private authentication methods are proposed and analyzed for Radio Frequency Identification (RFID) systems. In a typical RFID system, the provers are low-cost RFID tags, and the number of tags can potentially be very large. I study the problem of private authentication in such systems. More specifically, I propose two methods: privacy efficient key-tree based authentication and group based authentication.

The first key-tree based private authentication protocol was proposed by Molnar and Wagner as a neat way to efficiently solve the problem of privacy preserving authentication based on symmetric key cryptography. However, in the key-tree based approach, the level of privacy provided by the system to its members may decrease considerably if some members are compromised. In this thesis, I analyze this problem and show that careful design of the tree can help to minimize this loss of privacy. First, I introduce a benchmark metric for measuring the resistance of the system to a single compromised member. This metric is based on the well-known concept of anonymity sets. Then, I show how the parameters of the key-tree should be chosen in order to maximize the system's resistance to single member compromise under some constraints on the authentication delay. In the general case, when any member can be compromised, I give a lower bound on the level of privacy provided by the system. I also present some simulation results that show that this lower bound is quite sharp. The results of Chapter 2 can be directly used by system designers to construct optimal key-trees in practice.

In the second part of Chapter 2, I propose a novel group based authentication scheme similar to the key-tree based method. This scheme is also based on symmetric-key cryptography, and therefore, it is well-suited to resource constrained applications in large scale environments. I analyze the proposed scheme and show that it is superior to the previous key-tree based approach for private authentication both in terms of privacy and efficiency.

In Chapter 3, I analyze the privacy consequences of inter-vehicular communication. The promise of vehicular communications is to make road traffic safer and more efficient. However, besides the expected benefits, vehicular communications also introduce some privacy risk by making it easier to track the physical location of vehicles. One approach to solving this problem is for vehicles to use pseudonyms that they change with some frequency. In this chapter, I study the effectiveness of this approach. I define a model based on the concept of the mix zone, characterize the tracking strategy of the adversary in this model, and introduce a metric to quantify the level of privacy enjoyed by the vehicles. I also report on the results of an extensive simulation in which I used my model to determine the level of privacy achieved in realistic scenarios. In particular, in my simulation, I used a rather complex road map, generated traffic with realistic parameters, and varied the strength of the adversary by varying the number of her monitoring points. My simulation results provide information about the relationship between the strength of the adversary and the level of privacy achieved by changing pseudonyms.

From the first half of Chapter 3, it can be seen that untraceability of vehicles is an important requirement in future vehicle communications systems. Unfortunately, heartbeat messages used by many safety applications provide a constant stream of location data, and without any protection measures, they make tracking of vehicles easy even for a passive eavesdropper. The usual countermeasure is to change pseudonyms; however, against a global attacker, this approach is effective only if some silent period is kept during the pseudonym change and several vehicles change their pseudonyms at nearly the same time and at the same location. Unlike other works that proposed explicit synchronization between a group of vehicles and/or required pseudonym change in a designated physical area (i.e., a static mix zone), I propose a much simpler approach that needs neither explicit cooperation between vehicles nor infrastructure support. My basic idea is that vehicles should not transmit heartbeat messages when their speed drops below a given threshold, and they should change pseudonym during each such silent period. This ensures that vehicles stopping at traffic lights or moving slowly in a traffic jam will all refrain from transmitting heartbeats and change their pseudonyms at nearly the same time and location. Thus, my scheme ensures both silent periods and synchronized pseudonym change in time and space, but it does so in an implicit way. I also argue that the risk of a fatal accident at a slow speed is low, and therefore, my scheme does not seriously impact safety-of-life. In addition, refraining from sending heartbeat messages when moving at low speed also relieves vehicles of the burden of verifying a potentially large number of digital signatures, and thus makes it possible to implement vehicle communications with less expensive equipment.

In Chapter 4, I propose protocols that increase the dependability of wireless sensor networks, which are potentially useful building blocks in cyber-physical systems. Wireless sensor networks can be used in many critical applications such as military or critical infrastructure protection scenarios. In such a critical scenario, the dependability of the monitoring sensor network can be crucial. One interesting aspect of the dependability of a network is how the network can hide its nodes with specific roles from an eavesdropping or active attacker.

In this problem field, I propose protocols that can hide some important nodes of the network. More specifically, I propose two privacy preserving aggregator node election protocols, a privacy preserving data aggregation protocol, and a corresponding privacy preserving query protocol for sensor networks that allow for secure in-network data aggregation by making it difficult for an adversary to identify and then physically disable the designated aggregator nodes. The basic protocol can withstand a passive attacker, while my advanced protocols resist strong adversaries that can physically compromise some nodes. The privacy preserving aggregator election protocol allows aggregator nodes to be elected within the network without leaking any information about the identity of the elected node. The privacy preserving aggregation protocol lets the elected aggregator nodes collect data without revealing which node is actually collecting it. The privacy preserving query protocol enables an operator to collect the aggregated data from the unknown and anonymous aggregators without leaking the identity of the aggregating nodes.


Kivonat (Summary)

Wireless networks are part of everyday life. We use such networks, for example, to make phone calls, to access services available on the Internet, or in contactless card-based access control systems. In the near future their fields of application will expand considerably: among others, vehicles will communicate with each other in this way, and wireless networks will also play a role in the protection of critical infrastructure.

The widespread use of wireless networks raises new research problems. Such problem areas include mobility, the handling of unreliable links, the problems and challenges arising from scarce resources, and questions of privacy and data security. In this dissertation I investigate the privacy questions of different wireless networks.

In Chapter 2 of the dissertation I examine private authentication methods for radio frequency identification problems. A typical field of application is RFID systems, where a potentially very large number of users authenticate themselves to a reader by means of low-cost RFID cards. The two authentication methods are key-tree based and group based authentication.

The first key-tree based private authentication protocol was proposed by Molnar and Wagner. This method was an efficient private authentication protocol based on symmetric keys. It works very well as long as no user's secret keys are compromised. When that happens, not only does the compromised user enjoy less anonymity, but the anonymity of all other users is harmed as well.

In Chapter 2 I analyze how a careful choice of the tree parameters can minimize the lost anonymity. First, I define a metric that measures the effect of a single user being compromised in the system. This metric builds on the well-known concept of the anonymity set. Then I show how the parameters of the key-tree should be chosen so that the loss caused by the compromise is minimal in terms of this metric, subject to certain external constraints. In the general case, where not just one but several users can be compromised, I give a lower bound on the level of anonymity provided by the system. I show by simulation that this lower bound is typically an accurate estimate. The results of the chapter can be used directly in system design, when the key-tree best suited to the task has to be found.

In the second part of Chapter 2 I propose a new group based private authentication method. This method is also based on symmetric keys, so it is well suited to resource constrained devices too. In the chapter I analyze the proposed solution and show that in certain typical cases it performs better than the key-tree based method introduced at the beginning of the chapter.

In Chapter 3 I analyze the privacy consequences of inter-vehicle communication. Inter-vehicle communication, to be realized in the near future, enables safer and more efficient traffic, but at the same time it makes vehicles easier to track, which can significantly violate the privacy of drivers. One possible solution to the problem is for vehicles to use, instead of permanent identifiers, pseudonyms that they can change frequently. In this chapter I analyze the effectiveness of this solution. First, I create a model based on the mix zone. In this model I define the tracking strategy of the attacker and a metric that measures how trackable individual vehicles are. I then examine the model in a detailed simulation. During the simulation vehicles move realistically on a complex map, and I study the effect of the traffic and of the attacker's strength on trackability.

As can be seen from the first part of Chapter 3, the trackability of vehicles is an important consideration in inter-vehicle communication. Unfortunately, as we have seen, continuously transmitted position reports make vehicles easy to track. A general solution to the problem is for vehicles to keep changing their identifiers. This change, of course, can be effective only if at least a short time passes between the use of the two different identifiers during which the vehicle transmits nothing, and if several vehicles near each other change identifiers at the same time. While most solutions prescribe complicated synchronization or allow the change only at statically designated places, my solution is much simpler. It needs neither explicit cooperation nor external infrastructure: vehicles simply stop transmitting below a certain speed, and when they exceed this threshold speed again, they resume transmitting, but with the new identifier. As a result, vehicles waiting at a traffic light or crawling in a traffic jam stay silent and change identifiers at the same time. The method thus guarantees the necessary silent period in a simple way and achieves a change that is synchronized in space and time without any explicit synchronization. The method is fortunate for two reasons: on the one hand, the chance of a serious accident is small at low speed, so the vehicle stops transmitting exactly when transmission is not needed anyway; on the other hand, vehicles crawling close to each other generate a very large amount of data to be processed, which is also avoided in this way.

In Chapter 4 of the dissertation I propose protocols that can increase the dependability of a wireless sensor network. Wireless sensor networks can also be used for critical tasks such as military applications or critical infrastructure protection. In such critical tasks, protecting the nodes with special roles, and hiding them from attackers, can be very important.

Within this problem area I propose protocols that can hide the identity of the key devices. More precisely, I propose two private aggregator election protocols, a private aggregation protocol, and a private query protocol; when they are used in a sensor network, attackers cannot identify the aggregator devices. Of the two solutions, the simpler protocol provides security against passive eavesdropping, while the more complex protocol also provides protection against active attacks.


Acknowledgement

First of all, I would like to express my gratitude to my supervisor, Professor Levente Buttyán, Ph.D., Department of Telecommunications, Budapest University of Technology and Economics. He gave me guidance in selecting problems to work on, helped in elaborating the problems, and pushed me to publish the results. All three of these steps were needed to finish this thesis.

I am also grateful to the current and former members of the CrySyS Laboratory: Boldizsár Bencsáth, László Czap, László Csík, László Dóra, Amit Dvir, Gergely Kótyuk, Áron Lászka, Gábor Pék, Péter Schaffer, Vinh Thong Ta, and István Vajda for the illuminating discussions on different technical problems that I encountered during my research. They also provided a pleasant atmosphere to work in.

I would also like to thank Petra Ardelean, Naim Asaj, Gildas Avoine, Danny De Cock, Stefano Cosenza, Amit Dvir, László Dóra, Julien Freudiger, Albert Held, Jean-Pierre Hubaux, Frank Kargl, Antonio Kung, Zhendong Ma, Michael Müter, Panagiotis Papadimitratos, Maxim Raya, Péter Schaffer, Elmar Schoch, István Vajda, Andre Weimerskirch, William Whyte, and Björn Wiedersheim for our joint efforts and publications.

The financial support of the Mobile Innovation Centre (MIK) and the support of the SEVECOM (FP6-027795) and WSAN4CIP (FP7-225186) EU projects are gratefully acknowledged.

And last but not least, my thanks go to my wife Nóra, who accepted me as being a PhD student. I know sometimes it was not easy.


Contents

1 Introduction
  1.1 Introduction to RFID systems
  1.2 Introduction to Vehicular Ad Hoc Networks
  1.3 Introduction to Wireless Sensor Networks

2 Private Authentication
  2.1 Introduction to private authentication
  2.2 Resistance to single member compromise
  2.3 Optimal trees in case of single member compromise
  2.4 Analysis of the general case
  2.5 The group-based approach
  2.6 Analysis of the group based approach
  2.7 Comparison of the group and the key-tree based approach
  2.8 Related work
  2.9 Conclusion
  2.10 Related publications

3 Location Privacy in VANETs
  3.1 Introduction
  3.2 Model of local attacker and mix zone
    3.2.1 The concept of the mix zone
    3.2.2 The model of the mix zone
    3.2.3 The operation of the adversary
    3.2.4 Analysis of the adversary
    3.2.5 The level of privacy provided by the mix zone
  3.3 Simulation of mix zone
    3.3.1 Simulation settings
    3.3.2 Simulation results
  3.4 Global attacker
  3.5 Framework for location privacy in VANETs
  3.6 Attacker Model and the SLOW algorithm
  3.7 Analysis of SLOW
    3.7.1 Privacy
    3.7.2 Effects on safety
    3.7.3 Effects on computation complexity
  3.8 Related work
  3.9 Conclusion
  3.10 Related publications

4 Anonymous Aggregator Election and Data Aggregation in WSNs
  4.1 Introduction
  4.2 System and attacker models
  4.3 Basic protocol
    4.3.1 Protocol description
    4.3.2 Protocol analysis
    4.3.3 Data forwarding and querying
  4.4 Advanced protocol
    4.4.1 Initialization
    4.4.2 Data aggregator election
    4.4.3 Data aggregation
    4.4.4 Query
    4.4.5 Misbehaving nodes
  4.5 Related work
  4.6 Conclusion
  4.7 Related publications

5 Application of new results

6 Conclusion


List of Figures

2.1 Illustration of a key-tree
2.2 Illustration of single member compromise
2.3 Illustration of several members compromise
2.4 Simulation results for branching factor vectors
2.5 System comparison based on approximation
2.6 Operation of the group-based private authentication scheme
2.7 Tree and group based authentication
2.8 Simulation results
3.1 Mix and observed zone
3.2 Simplified map of Budapest generated for the simulation
3.3 Success probabilities of the adversary
3.4 Results of the simulation
3.5 Success rate of a tracking attacker
3.6 Example intersection
3.7 Success rate of the simple attacker
3.8 Success rate of the simple attacker
3.9 Number of signatures to be verified
4.1 Result of aggregator election protocol
4.2 Probability of being cluster aggregator
4.3 Probability of being cluster aggregator
4.4 Result of balancing
4.5 Entropy of the attacker
4.6 Connected dominating set
4.7 Aggregation example
4.8 Query example
4.9 Graphical representation of the suitable intervals
4.10 Misbehavior detection algorithm for the query protocol


List of Tables

2.1 Illustration of the operation of the recursive function f
3.1 Notation in SLOW
4.1 Estimated time of the building blocks on a Crossbow MICAz mote
4.2 Optimal γ values
4.3 Summary of complexity of the advanced protocol


List of Algorithms

1 Optimal branching factor generating algorithm
2 Basic private cluster aggregator election algorithm


Chapter 1

Introduction

In this dissertation privacy enhancing protocols for wireless networks are proposed. In this chapter, a brief overview is given of those wireless networks to which the work presented in this dissertation is related, namely Radio Frequency Identification (RFID) systems, Vehicular Ad Hoc Networks (VANETs), and Wireless Sensor Networks (WSNs). The privacy consequences of the usage of such networks and some related problems are sketched. The main reason for choosing these networks is that they are or will potentially be used by billions of users, so solving a problem related to these networks can affect the privacy of an extremely large number of users.

Wireless technology is a truly revolutionary paradigm shift, enabling multimedia communications between people and devices from any location. It also enables exciting applications such as sensor networks, smart homes, telemedicine, and automated highways. Comprehensive introductions to wireless networks can be found in [Goldsmith, 2005; Rappaport, 2001].

The security and privacy problems of wireless networks form a well-studied field; however, many open questions remain that are worth working on. Overviews of security and privacy in wireless networks can be found in [Buttyán and Hubaux, 2008; Juels, 2006; Raya and Hubaux, 2007; Akyildiz et al., 2002].

A wireless network consists of nodes that can communicate through wireless channels, such as Infrared (IR) or Radio Frequency (RF) channels. From the security point of view, the main difference between wireless and traditional wired networks is that a passive attacker can easily eavesdrop on the wireless channel without detection, while this is harder with wired networks. Harder here means that such attacks on a wired network require physical access to the network (cables or network elements), whereas the lack of physical protection in wireless networks makes these attacks easier to carry out. An active attacker with some knowledge of the network and wireless technologies can inject, modify, and delete messages in the air, while again this is harder in a traditional wired network.

In information technology, privacy is defined as the right of an entity to choose which information is revealed about the entity, what information is collected and stored, how that information is used, shared or published, and also the right to keep control of that information (e.g., the right to delete data from a database if the user wishes to do so). Privacy actually has two facets: data control and data protection. One way to keep control is to keep data secret, e.g., to remain anonymous. According to [Pfitzmann and Köhntopp, 2001], anonymity is the state of being not identifiable within a set of subjects, the anonymity set. In the remaining part of the dissertation, I will use privacy with this information-centric meaning, and decisional privacy¹ or intentional privacy² will not be discussed.

¹ This conception of privacy addresses issues related to an individual's authority to make decisions that affect the individual's life and body and that of the individual's family members such as end of life issues. [ITLaw]

² This conception of privacy addresses issues related to intimate activities or characteristics that are publicly visible. [ITLaw]
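One common way to turn the notion of an anonymity set into a number is to average the size of the set a subject hides in. The Python sketch below assumes an observer who can only partition the subjects into groups of mutually indistinguishable members, and a subject of interest chosen uniformly at random. The function name, the group labels, and the eight-subject example are illustrative assumptions, not the metric defined later in the dissertation.

```python
from collections import Counter
from fractions import Fraction


def expected_anonymity_set_size(partition_of):
    """Average anonymity set size seen by a uniformly chosen subject.

    partition_of maps each subject to a label; subjects sharing a label
    are indistinguishable to the (hypothetical) observer.
    """
    sizes = Counter(partition_of.values())
    total = sum(sizes.values())
    # A random subject falls into a set of size s with probability s / total,
    # and then hides among s subjects.
    return sum(Fraction(s * s, total) for s in sizes.values())


# Eight subjects; the observer can only tell three groups apart.
view = {"u1": "A", "u2": "A", "u3": "A", "u4": "A",
        "u5": "B", "u6": "B", "u7": "B", "u8": "C"}
print(expected_anonymity_set_size(view))  # prints 13/4
```

With no information the observer sees a single group and the value equals the number of subjects; if every subject is distinguishable, it drops to 1.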


In the remainder of this chapter, the three types of wireless networks I worked with in this dissertation are introduced.

1.1 Introduction to RFID systems

The following description of RFID systems and their security and privacy problems is based on [Juels, 2006; Langheinrich, 2009; Peris-Lopez et al., 2006]. The interested reader can get a broader view and a deeper understanding of RFID systems by reading the cited papers instead of relying only on this short introduction.

RFID (Radio-Frequency IDentification) is a technology for automated identification of objects or people. An RFID system consists of simple Tags, Readers, and Backend Servers. The Tags carry unique identifiers. These unique identifiers are read by nearby Readers by radio communication. The Readers send the obtained identifiers to Backend Servers. The goal of an RFID system is the unique identification of the holders of the Tags.

Example applications of RFID systems include smart appliances, shopping, interactive objects, or medication compliance. This list can be expanded to hundreds of scenarios [Wu et al., 2009; RFID, 2012].

The main threats to privacy in RFID systems are tracking and inventorying. A tracking attacker<br />

can eavesdrop on message exchanges in different parts of the network. If the system is not defended<br />

against such attacks, the attacker can link different message exchanges of the same user, and hence<br />

can track the user. This is a very important concern in RFID systems, which is why this problem<br />

is discussed in Chapter 2 (the problem of tracking is actually not unique to RFID systems, and I<br />

will study it in a different context in Chapter 3, namely in vehicular networks).<br />
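The linking step of a tracking attack can be sketched in a few lines. This is only an illustration, assuming that tags answer with a static identifier and that the attacker has eavesdropping points at several places; the function name `link_sightings`, the place names, and the tag identifiers are hypothetical and do not come from the cited papers.

```python
from collections import defaultdict

def link_sightings(sightings):
    """Group eavesdropped (time, place, identifier) records by identifier.

    With static identifiers, every group is the movement trace of one tag holder.
    """
    tracks = defaultdict(list)
    for time, place, identifier in sorted(sightings):
        tracks[identifier].append((time, place))
    return dict(tracks)

# Hypothetical observations collected at three eavesdropping points.
sightings = [
    (3, "office", "tag-A"),
    (1, "home", "tag-A"),
    (2, "station", "tag-B"),
    (2, "station", "tag-A"),
]
tracks = link_sightings(sightings)
print(tracks["tag-A"])
```

If the tag never repeated an identifier, every group would contain a single sighting and no trace could be assembled this way.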

Inventorying is a specific attack against RFID systems. It relies on the assumption that in<br />

the near future, most of our objects will be tagged with remotely readable RFID tags. An attacker<br />

carrying out an inventorying attack can learn exactly what a user wears or has in her pockets or<br />

bag without the consent of the user.<br />

In Chapter 2, two private authentication methods are given, which make it difficult for an<br />

attacker to carry out tracking and inventorying attacks.<br />

Another important field of security problems regarding RFID is the authenticity of the tags. In<br />

short, the privacy problem is related to malicious readers, while the authenticity problem is related<br />

to malicious tags. The main problem is that tags can be counterfeited to obtain the<br />

same rights as the legitimate tag holds. In the following, I will assume the presence of malicious<br />

readers, but malicious tags are not considered.<br />

When considering the capabilities of RFID tags, the tags on the market can be classified into two<br />

main categories: basic tags with no real cryptographic capabilities and advanced tags with some<br />

symmetric key cryptography capabilities.<br />

Basic tags<br />

Basic RFID tags lack the resources to perform true cryptographic operations. The lack of cryptography<br />

in basic RFID tags is a big impediment to security design; cryptography, after all, is the<br />

main building block of data security. The main approaches to provide privacy to basic tags are the<br />

following: killing, sleeping, renaming, proxying, distance measurement, blocking, and legislation.<br />

Killing and sleeping are very similar approaches. The basic idea is that an authenticated<br />

command can reversibly or permanently switch off the tag.<br />

Another approach is to divide the identifier space into two separate parts by a modifiable<br />

privacy bit [Juels et al., 2003; Juels and Brainard, 2004]. The two parts are the private and the<br />

public parts. A blocker device can make the scanning of private tags infeasible, and the tags can<br />

be moved between the public and private zone on demand. Another device based solution is<br />

proxying, where the holder of the tag can use some equipment (such as a mobile phone) to enforce<br />

privacy [Floerkemeier et al., 2005; Juels et al., 2006; Rieback et al., 2005].<br />


The tracking problem stems from the fact that tags use static identifiers. Some proposals<br />

suggest that readers rename the tags [Instruments, 2005], or that the tag itself rotate pseudonyms<br />

[Juels, 2005a] to make tracking harder. In distance measurement, the tags can roughly measure<br />

their distance to the reader by measuring the signal-to-noise ratio of the channel [Fishkin et al.,<br />

2005]. This can be used to avoid distant aggressive scanning.<br />

A non-technical approach is legislation: there are some efforts to regulate the usage of RFID<br />

tags from the privacy point of view [Kelly and Erickson, 2005], but these efforts are far from<br />

complete. Ultimately, this approach may be more effective and cost efficient than any<br />

other (e.g. from an economic aspect, tracking is not worthwhile if the tracker can go to jail for doing<br />

so). The authentication of basic tags is as hard as providing privacy to them. There is some<br />

work [Juels, 2005b] on how the kill PIN can be used to authenticate the tags.<br />

Advanced tags<br />

Advanced tags are capable of simple symmetric key operations. However, weak cryptographic<br />

algorithms are targets of successful attacks [Bono et al., 2005]. Another type of attack against<br />

cryptographically enabled tags is the man-in-the-middle (MiM) attack. In a MiM attack, the attacker<br />

relays messages between the tag and the reader and, by doing so, can modify, delete, and<br />

inject messages in their communication. This can be done even if the tag and the reader are not in<br />

each other’s vicinity [Hancke, 2005; Kfir and Wool, 2005].<br />

The privacy of advanced tags is deeply analyzed in Chapter 2. In short, the problem is that<br />

the tag is not allowed to send its identifier in order to avoid tracking, therefore the reader needs a<br />

lot of trials to find the right decryption key.<br />

The computational burden on the reader can be partly alleviated with key-trees [Molnar and<br />

Wagner, 2004], synchronization [Ohkubo et al., 2004], or time-memory tradeoffs [Avoine et al.,<br />

2005; Avoine and Oechslin, 2005]. However, all known mitigation techniques lead to degradation<br />

of privacy or efficiency. The degradation of privacy is analyzed in Chapter 2, where efficient<br />

solutions are also proposed.<br />

1.2 Introduction to Vehicular Ad Hoc Networks<br />

The following description of Vehicular Ad Hoc Networks and their security and privacy properties<br />

is based on [Raya and Hubaux, 2005; Raya and Hubaux, 2007; Lin et al., 2008; Blum et al., 2004b;<br />

Dötzer, 2006]. The interested reader can get a broader view and deeper understanding of VANETs<br />

by reading the cited papers instead of only relying on this short introduction.<br />

The main motivations for using VANETs are to enhance traffic safety and traffic efficiency, to assist<br />

drivers, and to enable infotainment applications. A VANET consists of vehicles equipped<br />

with On Board Units (OBUs) and wireless communication equipment, Road Side Units (RSUs),<br />

and backend infrastructure. The vehicles exchange messages regularly with each other and with<br />

the infrastructure using wireless communication to achieve the main goals such as safer roads.<br />

The main vulnerabilities in VANETs come from the wireless nature of the communication and<br />

the sensitive information, such as the location of users, used by the network. One major vulnerability<br />

comes from the wireless nature of the system: the communication can be jammed easily, and<br />

messages can be forged. Another problem related to the wireless communication is that while the<br />

nodes are relaying messages, they can modify them. This is called In-Transit Traffic Tampering.<br />

Another kind of problem is that vehicles can impersonate other vehicles with higher privileges,<br />

such as emergency vehicles, to gain extra privileges. The most relevant problem to this dissertation<br />

is that the privacy of the drivers of the vehicles can be violated. This vulnerability is analyzed in<br />

Chapter 3. In general, an attacker can achieve her goals by tampering with the OBU, an RSU, sensor<br />

readings, or the wireless channel.<br />

Traditional mechanisms cannot deal with the vulnerabilities discussed above because of the<br />

new challenges in VANETs. One such challenge is the high network volatility caused by the highly<br />

mobile very large scale network. Another challenge is that the network must offer liability and<br />


privacy at the same time in an efficient way, as the applications are delay sensitive. To make things<br />

even worse, the network is very heterogeneous: different vehicles can have different equipment and<br />

abilities, so no single solution can solve every problem.<br />

When defining the key vulnerabilities and challenges of vehicular ad hoc networks, it is crucial<br />

to first define and characterize the possible attackers. In many papers [Raya and Hubaux, 2007;<br />

Hu et al., 2005], the attacker is characterized as follows:<br />

Insider vs. Outsider: The key difference between an insider and an outsider attacker is that<br />

an insider possesses legitimate and valid cryptographic credentials, while an outsider does not<br />

have any valid credentials. It is obvious that an insider attacker can mount stronger attacks<br />

than an outsider.<br />

Malicious vs. Rational: The main goal of a malicious attacker is to disrupt the normal operation<br />

of the network without any further goal, while a rational attacker wants to make some<br />

profit with his attack. In general, it is easier to handle a rational attacker, because his steps<br />

can be foreseen more easily.<br />

Active vs. Passive: A passive attacker only eavesdrops on the messages of the vehicles, while an<br />

active attacker can send, modify, or delete messages.<br />

Local vs. Global: A local attacker mounts his attack on a small area (or on some non continuous<br />

small areas), while a global attacker has influence on broader areas.<br />
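The four dimensions above can be captured in a small data structure, which makes it explicit which attacker a given analysis assumes. The sketch below is only an illustration; the class and field names (`AttackerModel`, `membership`, `motivation`, `activity`, `scope`) are hypothetical and are not taken from the cited papers.

```python
from dataclasses import dataclass
from enum import Enum

class Membership(Enum):
    INSIDER = "insider"      # holds valid cryptographic credentials
    OUTSIDER = "outsider"    # holds no valid credentials

class Motivation(Enum):
    MALICIOUS = "malicious"  # wants to disrupt the network
    RATIONAL = "rational"    # wants to profit from the attack

class Activity(Enum):
    ACTIVE = "active"        # can send, modify, or delete messages
    PASSIVE = "passive"      # only eavesdrops

class Scope(Enum):
    LOCAL = "local"          # covers one or a few small areas
    GLOBAL = "global"        # covers broad areas

@dataclass(frozen=True)
class AttackerModel:
    membership: Membership
    motivation: Motivation
    activity: Activity
    scope: Scope

    def describe(self) -> str:
        """Short label such as 'outsider rational passive local'."""
        parts = (self.membership, self.motivation, self.activity, self.scope)
        return " ".join(part.value for part in parts)

eavesdropper = AttackerModel(
    Membership.OUTSIDER, Motivation.RATIONAL, Activity.PASSIVE, Scope.LOCAL
)
print(eavesdropper.describe())
```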

In the following, some basic and sophisticated attacks are presented to give the reader an idea<br />

about the threats in vehicular ad hoc networks.<br />

An insider attacker can diffuse bogus information to affect the behavior of other drivers. The<br />

source of the information can be a falsified sensor reading or modified location data.<br />

In wireless networking, the wormhole attack [Hu et al., 2006] consists in tunneling packets<br />

between two remote nodes. Similarly, in VANETs, an attacker that controls at least two entities<br />

remote from each other and a high speed communication link between them can tunnel packets<br />

broadcast in one location to another, thus disseminating erroneous (but correctly signed)<br />

messages in the destination area.<br />

According to [Kroh et al., 2006] the following security concepts must be used in a vehicular<br />

ad hoc network to handle most of the possible attacks: identification and authentication concepts,<br />

privacy concepts, integrity concepts, access control and authorization concepts. The concepts are<br />

introduced in Section 3.8 with special attention to providing privacy to the users of the system.<br />

In Chapter 3, the privacy of VANETs is analyzed, especially the privacy provided by pseudonyms<br />

considering outsider rational passive local attackers. A pseudonym change algorithm is<br />

provided as well considering an outsider rational passive global attacker.<br />

1.3 Introduction to Wireless Sensor Networks<br />

The following description of Wireless Sensor Networks (WSNs) and the related security problems<br />

is based on [Akyildiz et al., 2002; Chan and Perrig, 2003; Li et al., 2009; Lopez, 2008; Perrig et<br />

al., 2004; Sharma et al., 2012; Yick et al., 2008]. The interested reader can get a broader view and<br />

deeper understanding of WSNs by reading the cited papers instead of only relying on this short<br />

introduction.<br />

A sensor network is composed of a large number of sensor nodes, which are typically densely<br />

deployed. A sensor node consists of sensor circuits that can measure some environmental<br />

variable, a central processing unit, which is typically a microcontroller, and a radio circuit, which<br />

enables communication with other nearby nodes. The goal of a wireless sensor network can<br />

be one of many applications: military applications (e.g. battlefield surveillance), environmental<br />

applications (e.g. forest fire detection), critical infrastructure protection (e.g. surveillance of water<br />

pipes), health applications (e.g. drug administration in hospitals), home applications (e.g. smart<br />

environment).<br />


Some important security challenges in WSNs are: secure routing, secure key management, efficient<br />

(broadcast) authentication, secure localization, secure data aggregation. A good introduction<br />

to these problems and some countermeasures can be found in [Lopez, 2008].<br />

The privacy related challenges can be categorized into two main groups [Li et al., 2009]: data-oriented<br />

and context-oriented challenges. In data-oriented protection, the confidentiality of the<br />

measured data must be preserved. Context-oriented protection covers the location privacy of the<br />

source and some significant nodes such as the base station or aggregator nodes:<br />

Data-oriented privacy protection: Data-oriented privacy protection focuses on protecting<br />

the privacy of data content. Here, “data” refers not only to sensed data collected within a<br />

WSN but also queries posed to a WSN by users.<br />

– Privacy protection during data aggregation: Data aggregation is designed to<br />

substantially reduce the volume of traffic being transmitted in a WSN by fusing or<br />

compressing data in the intermediate sensor nodes (called aggregators). It is an important<br />

technique for preserving resources (e.g., energy consumption) in a WSN. Interestingly,<br />

it is also a common and effective method to preserve private data against<br />

an external adversary, because the process compresses large inputs to small outputs at<br />

the intermediate sensor nodes. On the other hand, a malicious aggregator can modify<br />

the measurements of many nodes with one step, or can learn the individual measurements<br />

of individual nodes. Some countermeasures are proposed in [He et al., 2007;<br />

Zhang et al., 2008].<br />

Cluster-based privacy data aggregation (CPDA): The basic idea of CPDA<br />

[He et al., 2007] is to introduce noise to the raw data sensed from a WSN, such<br />

that an aggregator can obtain accurate aggregated information but not<br />

individual data points.<br />

Slice-mixed aggregation (SMART): The main idea of SMART [He et al., 2007]<br />

is to slice original data into pieces and recombine them randomly. This is done in<br />

three phases: slicing, mixing, and aggregation.<br />

Generic privacy-preservation solutions for approximate aggregation (GP²S):<br />

The basic idea of GP²S [Zhang et al., 2008] is to generalize the values of data transmitted<br />

in a WSN, such that although individual data content cannot be decrypted,<br />

the aggregator can still obtain an accurate estimate of the histogram of data distribution,<br />

and thereby approximate the aggregates.<br />

– Private data query: The query issued to a WSN (to retrieve the collected data)<br />

is often also a critical privacy concern. To address this challenge, a target-region<br />

transformation technique was proposed in [Carbunar et al., 2007] to blur the target<br />

region of the query according to predefined transformation functions.<br />

Context-oriented privacy protection: Context-oriented privacy protection focuses on<br />

protecting contextual information, such as the location and timing information of traffic<br />

transmitted in a WSN. Location privacy concerns may arise for such special sensor nodes as<br />

the data source and the base station. Timing privacy, on the other hand, concerns the time<br />

when sensitive data is created at a data source, collected by a sensor node and transmitted<br />

to the base station.<br />

– Location privacy: A major challenge for context-oriented privacy protection is that<br />

an adversary may be able to compromise private information even without the ability<br />

of decrypting the transmitted data. In particular, since hop-by-hop transmission is<br />

required to address the limited transmission range of sensor nodes, an adversary may<br />

derive the locations of important nodes and data sources by observing and analyzing<br />

the traffic patterns between different hops.<br />

Location privacy of data source: In event driven networks, an event is generated<br />

if something interesting happens in the vicinity of the node. In some networks, the<br />


only data sent to the base station is the occurrence of the event. Thus the presence<br />

of communication reveals the location of the event. In some situations, it must be<br />

hidden from an attacker. Some approaches are described in the following:<br />

Baseline and probabilistic flooding mechanisms: The basic idea of baseline<br />

flooding is for each sensor to broadcast the data it receives from one neighbor<br />

to all of its other neighbors. The premise of this approach is that all sensors<br />

participate in the data transmission so that it is unlikely for an attacker to<br />

track a path of transmission back to the data source [Kamat et al., 2005]. This<br />

can be further optimized if not every node rebroadcasts the message, only a<br />

probabilistic set of them.<br />

Random walk mechanisms: According to [Kamat et al., 2005], a random<br />

walk can be performed before the probabilistic flooding to further increase the<br />

uncertainty of the attacker. To improve simple random walk, a two-way greedy<br />

random walk (GROW) scheme was proposed in [Xi et al., 2006].<br />

Dummy data mechanism: To further protect the location of the data source,<br />

fake data packets can be introduced to perturb the traffic patterns observed by<br />

the adversary. In particular, a simple scheme called Short-lived Fake Source<br />

Routing was proposed in [Kamat et al., 2005] for each sensor to send out a fake<br />

packet with a pre-determined probability.<br />

Fake data sources mechanism: The basic idea of fake data source is to<br />

choose one or more sensor nodes to simulate the behavior of a real data source<br />

in order to confuse the adversaries [Mehta et al., 2007].<br />

Location privacy of base station: In a WSN, a base station is not only in<br />

charge of collecting and analyzing data, but also used as the gateway connecting the<br />

WSN with outside wireless or wired network. Consequently, destroying or isolating<br />

the base station may lead to the malfunction of the entire network. This can be<br />

circumvented if the location of the base station is unknown to the adversary.<br />

Defense against local adversaries: The location information or identifier<br />

of the base station is sent in clear in many protocols. This information must be<br />

hidden from an eavesdropper, which can be done by traditional cryptographic<br />

techniques (encryption). Another problem can be if the attacker can follow<br />

the way of packets from the source towards the base station. This can be<br />

mitigated by changing data appearance by re-encryption [Deng et al., 2006a;<br />

Dingledine et al., 2004], routing with multiple parents [Deng et al., 2005; Deng et<br />

al., 2006a], routing with random walk [Jian et al., 2007], or decorrelating the parent-child<br />

relationship by randomly selecting the sending time [Deng et al., 2006a].<br />

Defense against global adversaries: The techniques discussed above are<br />

ineffective against a global attacker. To fight against a global attacker, the<br />

traffic patterns of the whole network must be modified. This can be done by<br />

hiding traffic pattern by controlling transmission rate [Deng et al., 2006a], or<br />

by propagating dummy data [Deng et al., 2005; Deng et al., 2006a].<br />

– Temporal privacy problem: When an adversary eavesdrops a message, it can<br />

deduce the sending time of the message from the time it eavesdropped and the TTL<br />

value. In some applications this information must be hidden. It can be done by randomly<br />

delaying the messages by the relaying nodes [Kamat et al., 2007].<br />
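As one concrete illustration of the mechanisms listed above, the slicing and mixing idea behind SMART can be sketched as follows. This is a simplified sketch under stated assumptions, not the protocol of [He et al., 2007]: slices are sent unencrypted, any node may receive a slice from any other node, and the slice range of ±100 is arbitrary. The function names `slice_value` and `smart_round` and the sensor readings are hypothetical.

```python
import random

def slice_value(value, pieces, rng):
    """Split an integer reading into `pieces` integers that sum to `value`."""
    cuts = [rng.randint(-100, 100) for _ in range(pieces - 1)]
    return cuts + [value - sum(cuts)]

def smart_round(readings, pieces, rng):
    """Slicing and mixing: each node keeps one slice and hands the rest to other nodes."""
    nodes = list(readings)
    mixed = {node: 0 for node in nodes}
    for node, value in readings.items():
        slices = slice_value(value, pieces, rng)
        mixed[node] += slices[0]                  # slice kept locally
        others = [n for n in nodes if n != node]
        for piece in slices[1:]:
            mixed[rng.choice(others)] += piece    # slice sent to another node
    return mixed

rng = random.Random(1)
readings = {"a": 20, "b": 23, "c": 19, "d": 25}   # hypothetical sensor readings
mixed = smart_round(readings, pieces=3, rng=rng)

# Aggregation: the sum of the mixed values equals the sum of the raw readings,
# although the value a node reports is no longer its own reading.
print(sum(mixed.values()), sum(readings.values()))
```

The design point is that the aggregate is an invariant of the mixing step, so the aggregator still obtains the exact sum while individual readings are hidden behind the slices of other nodes.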

As can be seen from the discussion above, a considerable amount of work has been done in the<br />

field of privacy in wireless sensor networks. However, the particular problem of location privacy<br />

of aggregator nodes has received less attention. Therefore, in Chapter 4, I study this problem and<br />

propose two anonymous aggregator election protocols, which can hide the identity of the aggregator<br />

nodes.<br />


The remainder of the dissertation is organized as follows: In Chapter 2, I propose two private<br />

authentication schemes for resource limited systems, such as RFID systems. The results presented<br />

in Chapter 2 have been published in [Buttyan et al., 2006a; Buttyan et al., 2006b; Avoine et<br />

al., 2007]. In Chapter 3, I analyze the privacy achieved by pseudonym changing techniques in<br />

vehicular ad hoc networks, and propose a pseudonym changing algorithm for VANETs. All results<br />

of Chapter 3 have been published in [Buttyan et al., 2007; Papadimitratos et al., 2008; Holczer et<br />

al., 2009; Buttyan et al., 2009]. In Chapter 4, I analyze how an aggregator node can be elected and<br />

used in wireless sensor networks without revealing its identity. All results of Chapter 4 have been<br />

published in [Buttyán and Holczer, 2009; Buttyán and Holczer, 2010; Holczer and Buttyán, 2011;<br />

Schaffer et al., 2012]. Possible applications of the new results are discussed in Chapter 5, while<br />

Chapter 6 concludes the dissertation.<br />



Chapter 2<br />

Private Authentication in Resource<br />

Constrained Environments<br />

2.1 Introduction to private authentication<br />

Entity authentication is the process whereby a party (the prover) corroborates its identity to<br />

another party (the verifier). Entity authentication is often based on authentication protocols in<br />

which the parties pass messages to each other. These protocols are engineered in such a way that<br />

they resist various types of impersonation and replay attacks [Boyd and Mathuria, 2003]. However,<br />

less attention is paid to the requirement of preserving the privacy of the parties (typically that of<br />

the prover) with respect to an eavesdropping third party. Indeed, in many of the well-known and<br />

widely used authentication protocols (e.g., [ISO, 2008; Kohl and Neuman, 1993]) the identity of<br />

the prover is sent in cleartext, and hence, it is revealed to an eavesdropper.<br />

One approach to solve this problem is based on public key cryptography, and it consists of<br />

encrypting the identity information of the prover with the public key of the verifier so that no<br />

one but the verifier can learn the prover’s identity [Abadi and Fournet, 2004]. Another approach,<br />

also based on public key techniques, is that the parties first run an anonymous Diffie-Hellman key<br />

exchange and establish a confidential channel, through which the prover can send its identity and<br />

authentication information to the verifier in a second step. An example for this second approach is<br />

the main mode of the Internet Key Exchange (IKE and IKEv2) protocol [Harkins and Carrel, 1998;<br />

Black and McGrew, 2008]. While it is possible to hide the identity of the prover by using the above<br />

mentioned approaches, they provide an appropriate solution to the problem only if the parties can<br />

afford public key cryptography. In many applications, such as low cost RFID tags and contactless<br />

smart card based automated fare collection systems in mass transportation, this is not the case,<br />

while at the same time, the provision of privacy (especially location privacy) in those systems is<br />

strongly desirable.<br />

The problem of using symmetric key encryption to hide the identity of the prover is that<br />

the verifier does not know which symmetric key it should use to decrypt the encrypted identity,<br />

because the appropriate key cannot be retrieved without the identity. The verifier may try all<br />

possible keys in its key database until one of them properly decrypts the encrypted identity 1, but<br />

this would increase the authentication delay if the number of potential provers is large. Long<br />

authentication delays are usually not desirable; moreover, in some cases, they may not even be<br />

acceptable. As an example, let us consider again contactless smart card based electronic tickets<br />

in public transportation: the number of smart cards in the system (i.e., the number of potential<br />

provers) may be very large in big cities, while the time needed to authenticate a card should be<br />

short in order to ensure a high throughput of passengers and avoid long queues at entry points.<br />

1 This of course requires redundancy in the encrypted message so that the verifier can determine if the decryption<br />

was successful.<br />
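The linear key search described above can be sketched as follows. This is a minimal illustration rather than a concrete protocol from the literature: as an assumption, the prover answers a challenge with an HMAC instead of an encrypted identity, so a matching MAC plays the role of the redundancy check mentioned in the footnote. The function names and the `card-…` identifiers are hypothetical.

```python
import hashlib
import hmac
import os

def tag_response(key, challenge):
    """Prover side: answer the challenge without sending any identifier."""
    return hmac.new(key, challenge, hashlib.sha256).digest()

def identify_linear(key_db, challenge, response):
    """Verifier side: try every stored key until one explains the response.

    Returns (member_id, keys_tried), or (None, keys_tried) if no key matches.
    """
    tried = 0
    for member_id, key in key_db.items():
        tried += 1
        expected = hmac.new(key, challenge, hashlib.sha256).digest()
        if hmac.compare_digest(expected, response):
            return member_id, tried
    return None, tried

# Hypothetical database of 1000 potential provers with random 128-bit keys.
key_db = {f"card-{i}": os.urandom(16) for i in range(1000)}
challenge = os.urandom(16)
response = tag_response(key_db["card-999"], challenge)
member, tried = identify_linear(key_db, challenge, response)
print(member, tried)
```

In the worst case shown here the verifier tests all 1000 keys, which is the linear cost that makes this approach unattractive for large systems.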



2. PRIVATE AUTHENTICATION<br />

Some years ago, Molnar and Wagner proposed an elegant approach to privacy protecting authentication<br />

[Molnar and Wagner, 2004] that is based on symmetric key cryptography while still<br />

ensuring short authentication delays. More precisely, the complexity of the authentication procedure<br />

in the Molnar-Wagner scheme is logarithmic in the number of potential provers, in contrast<br />

with the linear complexity of the naïve key search approach. The main idea of Molnar and Wagner<br />

is to use key-trees (see Figure 2.1 for illustration). A key-tree is a tree where a unique key is assigned<br />

to each edge. The leaves of the tree represent the potential provers, which are called members<br />

in the sequel. Each member possesses the keys assigned to the edges of the path starting from the<br />

root and ending in the leaf that corresponds to the given member. The verifier knows all keys in<br />

the tree. In order to authenticate itself, a member uses all of its keys, one after the other, starting<br />

from the first level of the tree and proceeding towards lower levels. The verifier first determines<br />

which first level key has been used. For this, it needs to search through the first level keys only.<br />

Once the first key is identified, the verifier continues by determining which second level key has<br />

been used. However, for this, it needs to search through those second level keys only that reside<br />

below the already identified first level key in the tree. This process is continued until all keys are<br />

identified, which in the end identifies the authenticating member. The key point is that the verifier<br />

can reduce the search space considerably each time a key is identified, because it needs to consider<br />

only the subtree below the recently identified key.<br />


Figure 2.1: Illustration of a key-tree. There is a unique key assigned to each edge. Each leaf<br />

represents a member of the system that possesses the keys assigned to the edges of the path<br />

starting from the root and ending in the given leaf. For instance, the member that belongs to the<br />

leftmost leaf in the figure possesses the keys k1, k11, and k111.<br />
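The level-by-level search of the verifier can be sketched as follows. This is an illustrative sketch under assumptions, not the exact Molnar-Wagner protocol: the member proves knowledge of each key with an HMAC over a challenge, the tree has the same branching factor at every level, and all function names are hypothetical.

```python
import hashlib
import hmac
import itertools
import os

def build_key_tree(branching, depth):
    """Assign a random key to every edge; an edge is named by the path ending in it."""
    keys = {}
    for level in range(1, depth + 1):
        for path in itertools.product(range(branching), repeat=level):
            keys[path] = os.urandom(16)
    return keys

def member_keys(keys, leaf):
    """Keys on the path from the root to the given leaf (a tuple of child indices)."""
    return [keys[leaf[:level]] for level in range(1, len(leaf) + 1)]

def tag_response(path_keys, challenge):
    """One MAC per tree level, computed with the member's key of that level."""
    return [hmac.new(k, challenge, hashlib.sha256).digest() for k in path_keys]

def identify(keys, branching, challenge, response):
    """Walk down the tree, testing only the children of the node found so far."""
    path, tried = (), 0
    for mac in response:
        for child in range(branching):
            tried += 1
            candidate = path + (child,)
            expected = hmac.new(keys[candidate], challenge, hashlib.sha256).digest()
            if hmac.compare_digest(expected, mac):
                path = candidate
                break
        else:
            return None, tried  # no key on this level matches
    return path, tried

keys = build_key_tree(branching=10, depth=3)   # 1000 members
leaf = (9, 9, 9)                               # worst case: last child on every level
challenge = os.urandom(16)
response = tag_response(member_keys(keys, leaf), challenge)
found, tried = identify(keys, 10, challenge, response)
print(found, tried)
```

With 1000 members arranged in three levels of ten children each, the verifier tests at most 30 keys, compared with up to 1000 in a flat key database.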

The problem of the above described tree-based approach is that upper level keys in the tree are<br />

used by many members, and therefore, if a member is compromised and its keys become known<br />

to the adversary, then the adversary gains partial knowledge of the keys of other members too<br />

[Avoine et al., 2005]. This obviously reduces the privacy provided by the system to its members,<br />

since by observing the authentication of an uncompromised member, the adversary can recognize<br />

the usage of some compromised keys, and therefore its uncertainty regarding the identity of the<br />

authenticating member is reduced (it may be able to determine which subtree the member belongs<br />

to).<br />
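The effect of a single compromised member can be made concrete by grouping all members according to how many of the compromised keys their authentication would reveal. The sketch below is an illustration only; it assumes that the adversary recognizes exactly those keys that a member shares with the compromised member, and the names `shared_levels` and `partition_sizes` are hypothetical.

```python
import itertools
from collections import Counter

def shared_levels(a, b):
    """Number of leading tree levels on which two leaves use the same keys."""
    count = 0
    for x, y in zip(a, b):
        if x != y:
            break
        count += 1
    return count

def partition_sizes(branching, compromised):
    """Count the leaves per number of keys shared with the compromised leaf.

    `branching` lists the branching factor of each level; leaves that share the
    same number of keys cannot be told apart by the adversary.
    """
    leaves = itertools.product(*(range(b) for b in branching))
    return Counter(shared_levels(leaf, compromised) for leaf in leaves)

sizes = partition_sizes(branching=(2, 2, 2), compromised=(0, 0, 0))
print(dict(sorted(sizes.items())))   # {0: 4, 1: 2, 2: 1, 3: 1}
```

In this binary tree with 8 members, only 4 members remain fully hidden among each other; 2 members can be narrowed down to a pair, and one uncompromised member becomes uniquely recognizable. For a single-level tree, `partition_sizes((8,), (0,))` gives `{0: 7, 1: 1}`, so the 7 uncompromised members stay indistinguishable.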

One interesting observation is that the naïve, linear key search approach can be viewed as a<br />

special case of the key-tree based approach, where the key-tree has a single level and each member<br />

has a single key. Regarding the above described problem of compromised members, the naïve<br />

approach is in fact optimal, because compromising a member does not reveal any key information<br />

of other members. At the same time, as described above, the authentication delay is the worst in<br />

this case. On the other hand, in case of a binary key-tree, it can be observed that the compromise<br />

of a single member strongly 2 affects the privacy of the other members, while at the same time,<br />

the binary tree is very advantageous in terms of authentication delay. Thus, there seems to be a<br />

trade-off between the level of privacy provided by the system and the authentication delay, which<br />

depends on the parameters of the key-tree, but it is far from obvious what the optimal<br />

key-tree should look like. In this chapter, I address this problem, and I show how to find optimal<br />

key-trees.<br />

2 The precise quantification of this effect is the topic of this chapter and will be presented later.<br />

In this chapter, after finding the optimal key-tree, I go further and I present a novel symmetric-key<br />

private authentication scheme that provides a higher level of privacy and achieves better<br />

efficiency than the key-tree based approach. This approach is called the group based approach.<br />

More precisely, the complexity of the group based scheme for the reader can be set to be O(log N)<br />

(i.e., the same as in the key-tree based approach), while the complexity for the tags is always a<br />

constant (in contrast to O(log N) of the key-tree based approach). Hence, the group based scheme<br />

is better than the key-tree based scheme both in terms of privacy and efficiency, and therefore, it<br />

is a serious alternative to the key-tree based scheme to be considered by the RFID community.<br />

More precisely, the main contributions are the following:<br />

I propose a benchmark metric for measuring the resistance of the system to a single compromised<br />

member based on the concept of anonymity sets. To the best of my knowledge,<br />

anonymity sets have not been used in the context of private authentication yet. I prove that<br />

this simply defined metric is equivalent to a metric widely used in cryptography with a much<br />

more complex definition. The real contribution of the metric is that its definition simplifies<br />

the usage of the metric without losing any details of the more complex metric.<br />

I introduce the idea of using different branching factors at different levels of the key-tree;<br />

the advantage is that the system’s resistance to single member compromise can be increased<br />

while still keeping the authentication delay short. To the best of my knowledge, key-trees<br />

with variable branching factors have not been proposed yet for private authentication.<br />

I present an algorithm for determining the optimal parameters of the key-tree, where optimal<br />

means that resistance to single member compromise is maximized, while the authentication<br />

delay is kept below a predefined threshold.<br />

In the general case, when any member can be compromised, I give a lower bound on the<br />

level of privacy provided by the system, and present some simulation results that show that<br />

this lower bound is quite sharp. This allows me to compare different systems based on their<br />

lower bounds.<br />

I introduce a group based approach, which is superior to the tree-based approach in many<br />

properties.<br />

In summary, I propose practically usable techniques for designers of RFID based authentication<br />

systems.<br />

The outline of the chapter is the following: in Section 2.2, I introduce my benchmark metric<br />

to measure the level of privacy provided by key-tree or group based authentication systems, and<br />

I illustrate, through an example, how this metric can be used to compare systems with different<br />

parameters. By the same token, I also show that key-trees with variable branching factors can be<br />

better than key-trees with a constant branching factor at every level. In Section 2.3, I formulate<br />

the problem of finding the best key-tree with respect to my benchmark metric as an optimization<br />

problem, and I present an algorithm that solves that optimization problem. In Section 2.4, I<br />

consider the general case, when any number of members can be compromised, and I derive a useful<br />

lower bound on the level of privacy provided by the system. After finding the optimal key-tree, I<br />

describe the operation of my group based scheme in Section 2.5, and I quantify the level of privacy<br />

that it provides in Section 2.6. I compare the group based scheme to the key-tree based approach<br />

in Section 2.7. Finally, in Section 2.8, I report on some related work, and in Section 2.9, I conclude<br />

the chapter.<br />

2.2 Resistance to single member compromise<br />

There are different ways to measure the level of anonymity provided by a system [Diaz et al., 2002;<br />

Serjantov and Danezis, 2003]. Here the concept of anonymity sets [Chaum, 1988] is used. The<br />


anonymity set of a member v is the set of members that are indistinguishable from v from the<br />

adversary’s point of view. The size of the anonymity set is a good measure of the level of privacy<br />

provided for v, because it is related to the level of uncertainty of the adversary, if all members<br />

of the set are equally likely (otherwise an entropy based metric can be used). Clearly, the<br />

larger the anonymity set is, the higher the level of privacy is. The minimum size of the anonymity<br />

set is 1, and its maximum size is equal to the number of all members in the system. In order to<br />

make the privacy measure independent of the number of members, one can divide the anonymity<br />

set size by the total number of members, and obtain a normalized privacy measure between 0 and<br />

1. Such normalization makes the comparison of different systems easier.<br />

Now, let us consider a key-tree with ℓ levels and branching factors b1, b2, . . . , bℓ at the levels, and<br />

let us assume that exactly one member is compromised (see Figure 2.2 for illustration). Knowledge<br />

of the compromised keys allows the adversary to partition the members into subsets P0, P1, P2, . . .,<br />

where<br />

P0 contains the compromised member only,<br />

P1 contains the members the parent of which is the same as that of the compromised member,<br />

and that are not in P0,<br />

P2 contains the members the grandparent of which is the same as that of the compromised<br />

member, and that are not in P0 ∪ P1,<br />

etc.<br />

Members of a given subset are indistinguishable for the adversary, while it can distinguish between<br />

members that belong to different subsets. Hence, each subset is the anonymity set of its members.<br />

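[Figure 2.2 drawing: key-tree in which the keys k1, k11 and k111 lie on the path from the root to the compromised leftmost leaf; the leaves are grouped into the subsets P0, P1, P2, P3.]<br />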

Figure 2.2: Illustration of what happens when a single member is compromised. Without loss<br />

of generality, it is assumed that the member corresponding to the leftmost leaf in the figure is<br />

compromised. This means that the keys k1, k11, and k111 become known to the adversary. This<br />

knowledge of the adversary partitions the set of members into anonymity sets P0, P1, . . . of different<br />

sizes. Members that belong to the same subset are indistinguishable to the adversary, while it can<br />

distinguish between members that belong to different subsets. For instance, the adversary can<br />

recognize a member in subset P1 by observing the usage of k1 and k11 but not that of k111, where<br />

each of these keys is known to the adversary. Members in P3 are recognized by the fact that the usage<br />
of none of the keys known to the adversary can be observed.<br />

The level of privacy provided by the system can be characterized by the level of privacy provided<br />

to a randomly selected member, or in other words, by the expected size of the anonymity set of a<br />

randomly selected member. By definition, the expected anonymity set size is:<br />

\[ \bar{S} = \sum_{i=0}^{\ell} \frac{|P_i|}{N}\,|P_i| = \sum_{i=0}^{\ell} \frac{|P_i|^2}{N} \qquad (2.1) \]<br />



where N is the total number of members, and |Pi|/N is the probability of selecting a member from<br />

subset Pi. The resistance to single member compromise, denoted by R, is defined as the normalized<br />

expected anonymity set size, which can be computed as follows:<br />

\[
\begin{aligned}
R = \frac{\bar{S}}{N} &= \sum_{i=0}^{\ell} \frac{|P_i|^2}{N^2} \\
&= \frac{1}{N^2}\left(1 + (b_\ell - 1)^2 + ((b_{\ell-1} - 1)b_\ell)^2 + \ldots + ((b_1 - 1)b_2 b_3 \cdots b_\ell)^2\right) \\
&= \frac{1}{N^2}\left(1 + (b_\ell - 1)^2 + \sum_{i=1}^{\ell-1} (b_i - 1)^2 \prod_{j=i+1}^{\ell} b_j^2\right)
\end{aligned}
\qquad (2.2)
\]<br />
where it is used that<br />
\[
\begin{aligned}
|P_0| &= 1 \\
|P_1| &= b_\ell - 1 \\
|P_2| &= (b_{\ell-1} - 1)b_\ell \\
|P_3| &= (b_{\ell-2} - 1)b_{\ell-1}b_\ell \\
&\;\;\vdots \\
|P_\ell| &= (b_1 - 1)b_2 b_3 \cdots b_\ell
\end{aligned}
\]<br />

As its name indicates, R characterizes the loss of privacy due to the compromise of a single<br />

member of the system. If R is close to 1, then the expected anonymity set size is close to the total<br />

number of members, and hence, the loss of privacy is small. On the other hand, if R is close to<br />

0, then the loss of privacy is high, as the expected anonymity set size is small. R is used as a<br />

benchmark metric based on which different systems can be compared.<br />
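As a small worked instance of (2.1) and (2.2), consider a hypothetical two-level key-tree with B = (3, 2), so N = 6. This toy tree is an illustration added here and does not appear in the original analysis:<br />

```latex
\[
\begin{aligned}
|P_0| &= 1, \qquad |P_1| = b_2 - 1 = 1, \qquad |P_2| = (b_1 - 1)\,b_2 = 4, \\
\bar{S} &= \frac{1^2 + 1^2 + 4^2}{6} = 3, \qquad
R = \frac{\bar{S}}{N} = \frac{18}{36} = \frac{1}{2}.
\end{aligned}
\]
```

In this toy tree, a single compromise therefore reduces the expected anonymity set of a randomly selected member to half of the population.<br />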

This metric may seem somewhat ad hoc, but the same metric is in fact used in other<br />
papers, such as [Avoine et al., 2005], with a different and more complex definition:<br />

Theorem 1. The expected anonymity set size based metric (R) is the complement of the one tag<br />
tampering based metric (M) defined in [Avoine et al., 2005].<br />

Proof. The metric M used in [Avoine et al., 2005] is defined in that paper as:<br />

1. The attacker has one tag T0 (e.g., her own) she can tamper with and thus obtain its complete<br />

secret. For the sake of calculation simplicity, we assume that T0 is put back into circulation.<br />

When the number of tags in the system is large, this does not significantly affect the results.<br />

2. She then chooses a target tag T. She can query it as much as she wants but she cannot<br />

tamper with it.<br />

3. Given two tags T1 and T2 such that T ∈ {T1, T2}, we say that the attacker succeeds if she<br />

definitely knows which of T1 and T2 is T . We define the probability to trace T as being the<br />

probability that the attacker succeeds. To do that, the attacker can query T1 and T2 as many<br />

times as she wants but, obviously, cannot tamper with them.<br />

In the following, P1 . . . Pk are the subsets of the tags after the compromise of some tags<br />
(\(\sum_{i=1}^{k} |P_i| = N\)).<br />

In the third step, the attacker can be successful if (and only if) T1 and T2 belong to different<br />
subsets.<br />

The probability of the attacker’s success is the probability that two randomly chosen tags<br />
belong to two different subsets. This probability can be calculated as follows:<br />
\[ M = 1 - \Pr(T_1, T_2 \text{ are in } P_1) - \ldots - \Pr(T_1, T_2 \text{ are in } P_k) = 1 - \sum_{i=1}^{k} \left(\frac{|P_i|}{N}\right)^2 \]<br />
This is the complement of the metric R (M + R = 1).<br />
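The identity M + R = 1 can be checked by brute force on a small tree. The following Python sketch assumes that T1 and T2 are drawn independently and uniformly (so a tag may be drawn twice), which is the reading under which the sum of squared fractions above is exact; the tree (3, 2, 2) and the helper names are illustrative only.<br />

```python
from collections import Counter
from fractions import Fraction
from math import prod


def anonymity_set_index(leaf, b):
    """Return i such that `leaf` lies in P_i when leaf 0 is the compromised member."""
    if leaf == 0:
        return 0
    span = 1
    for i, b_i in enumerate(reversed(b), start=1):
        span *= b_i  # leaves below the ancestor of leaf 0 that sits i levels above the leaves
        if leaf < span:
            return i
    raise ValueError("leaf index out of range")


def check_complement(b):
    """Compute M by enumerating all ordered tag pairs, and R from the subset sizes."""
    n = prod(b)
    labels = [anonymity_set_index(t, b) for t in range(n)]
    # M: fraction of ordered pairs (T1, T2) that fall into different subsets
    different = sum(labels[t1] != labels[t2] for t1 in range(n) for t2 in range(n))
    m = Fraction(different, n * n)
    # R: normalized expected anonymity set size, as in (2.2)
    r = Fraction(sum(s * s for s in Counter(labels).values()), n * n)
    return m, r


m, r = check_complement([3, 2, 2])
print(m, r, m + r)
```

Exact fractions are used instead of floating point numbers so that the sum can be compared with 1 without rounding concerns.<br />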



Obviously, a system with greater R is better, and therefore, one would like to maximize R (and<br />

at the same time minimize M). However, there are some constraints. The maximum authentication<br />

delay, denoted by D, is defined as the number of basic operations needed to authenticate any<br />

member in the worst case. The maximum authentication delay in case of key-tree based authentication<br />
can be computed as \(D = \sum_{i=1}^{\ell} b_i\). In most practical cases, there is an upper bound Dmax<br />

on the maximum authentication delay allowed in the system. For instance, in the specification<br />

for electronic ticketing systems for public transport applications in Hungary [Berki, 2008], it is<br />

required that a ticket validation transaction should be completed in 250 ms. Taking into account<br />

the details of the ticket validation protocol, one can derive Dmax for electronic tickets from such<br />

specifications. Therefore, in practice, the designer’s task is to maximize R under the constraint<br />

that D ≤ Dmax. This problem is addressed in Section 2.3.<br />

In the remainder of this section, I illustrate how the benchmark metric R can be used to<br />

compare different systems. This exercise will also lead to an important observation: key-trees with<br />
varying branching factors at different levels can provide a higher level of privacy than key-trees<br />

with a constant branching factor, while having the same or even a shorter authentication delay.<br />

Example: Let us assume that the total number N of members is 27000 and the upper bound Dmax<br />

on the maximum authentication delay is 90. Let us consider a key-tree with a constant branching<br />

factor vector B = (30, 30, 30), and another key-tree with branching factor vector B′ = (60, 10, 9, 5).<br />
Both key-trees can serve the given population of members, since 30³ = 60 · 10 · 9 · 5 = 27000.<br />

In addition, both key-trees ensure that the maximum authentication delay is not longer than<br />

Dmax: for the first key-tree, we have D = 3 · 30 = 90, whereas for the second one, we get<br />

D = 60+10+9+5 = 84. Using (2.2), we can compute the resistance to single member compromise<br />

for both key-trees. For the first tree, we get R ≈ 0.9355, while for the second tree we obtain<br />

R ≈ 0.9672. Thus, we can arrive at the conclusion that the second key-tree with variable branching<br />

factors is better, as it provides a higher level of privacy, while ensuring a smaller authentication<br />

delay.<br />
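The figures in this example can be reproduced with a few lines of code. The following Python sketch evaluates (2.2) and the delay D for both vectors; the function names `resistance` and `max_delay` are illustrative and are not part of the original text.<br />

```python
from math import prod


def resistance(b):
    """Resistance to single member compromise R, computed from (2.2)."""
    n = prod(b)
    sizes = [1]  # |P_0|: the compromised member itself
    below = 1    # product of the branching factors of the levels already visited
    for b_i in reversed(b):
        sizes.append((b_i - 1) * below)  # |P_1|, |P_2|, ..., |P_l|
        below *= b_i
    return sum(s * s for s in sizes) / n**2


def max_delay(b):
    """Maximum authentication delay D: the sum of the branching factors."""
    return sum(b)


for vector in ([30, 30, 30], [60, 10, 9, 5]):
    print(vector, prod(vector), max_delay(vector), round(resistance(vector), 4))
```

Each printed line shows the vector, the population size it serves, its delay D and its resistance R, so the two trees can be compared directly.<br />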

At this point, several questions arise naturally: Is there an even better branching factor vector<br />

than B ′ for N = 27000 and Dmax = 90? What is the best branching factor vector for this case?<br />

How can we find the best branching factor vector in general? I give the answers to these questions<br />

in the next section.<br />

2.3 Optimal trees in case of single member compromise<br />

The problem of finding the best branching factor vector can be described as an optimization<br />

problem as follows: Given the total number N of members and the upper bound Dmax on the<br />

maximum authentication delay, find a branching factor vector B = (b1, b2, . . . bℓ) such that R(B)<br />

is maximal subject to the following constraints:<br />

\[ \prod_{i=1}^{\ell} b_i = N \qquad (2.3) \]<br />
\[ \sum_{i=1}^{\ell} b_i \le D_{max} \qquad (2.4) \]<br />

This optimization problem is analyzed through a series of lemmas that will lead to an algorithm<br />

that solves the problem. The first lemma states that we can always improve a branching factor<br />

vector by ordering its elements in decreasing order, and hence, in the sequel only ordered vectors<br />

are considered:<br />

Lemma 1. Let N and Dmax be the total number of members and the upper bound on the<br />

maximum authentication delay, respectively. Moreover, let B be a branching factor vector and let<br />

B ∗ be the vector that consists of the sorted permutation of the elements of B in decreasing order.<br />

If B satisfies the constraints of the optimization problem defined above, then B ∗ also satisfies<br />

them, and R(B ∗ ) ≥ R(B).<br />


Proof. B ∗ has the same elements as B has, therefore, the sum and the product of the elements of<br />

B ∗ are the same as that of B, and so if B satisfies the constraints of the optimization problem,<br />

then B ∗ does so too.<br />

Now, let us assume that B∗ is obtained from B with the bubble sort algorithm. The basic step<br />
of this algorithm is to swap two neighboring elements if they are not in the right order. Let us<br />
suppose that bi < bi+1, and thus, the algorithm swaps bi and bi+1. Then, using<br />
(2.2), we can express the change ∆R in R caused by this swap as follows:<br />
\[
\begin{aligned}
\Delta R &= \frac{1}{N^2}\left((b_{i+1} - 1)^2 b_i^2 \prod_{j=i+2}^{\ell} b_j^2 + (b_i - 1)^2 \prod_{j=i+2}^{\ell} b_j^2\right)
 - \frac{1}{N^2}\left((b_i - 1)^2 b_{i+1}^2 \prod_{j=i+2}^{\ell} b_j^2 + (b_{i+1} - 1)^2 \prod_{j=i+2}^{\ell} b_j^2\right) \\
&= \frac{\prod_{j=i+2}^{\ell} b_j^2}{N^2}\left((b_{i+1} - 1)^2 b_i^2 + (b_i - 1)^2 - (b_i - 1)^2 b_{i+1}^2 - (b_{i+1} - 1)^2\right) \\
&= \frac{\prod_{j=i+2}^{\ell} b_j^2}{N^2}\left((b_{i+1} - 1)^2 (b_i^2 - 1) - (b_i - 1)^2 (b_{i+1}^2 - 1)\right) \\
&= \frac{(b_i - 1)(b_{i+1} - 1)\prod_{j=i+2}^{\ell} b_j^2}{N^2}\left((b_{i+1} - 1)(b_i + 1) - (b_i - 1)(b_{i+1} + 1)\right)
\end{aligned}
\]<br />
Since bi ≥ 2 for all i, ∆R is non-negative if<br />
\[ \frac{b_i + 1}{b_i - 1} \ge \frac{b_{i+1} + 1}{b_{i+1} - 1} \qquad (2.5) \]<br />
But (2.5) must hold, since the function \(f(x) = \frac{x+1}{x-1}\) is a monotone decreasing function, and by<br />
assumption, bi < bi+1. This means that when sorting the elements of B, we improve R(B) in<br />
every step, and thus, R(B∗) ≥ R(B) must hold. ⋄<br />
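Lemma 1 can be illustrated numerically by comparing all orderings of a small vector. The sketch below is a sanity check rather than a proof; it reuses the factors of the example in Section 2.2, and the helper name `scaled_resistance` is an assumption made for this illustration.<br />

```python
from itertools import permutations
from math import prod


def scaled_resistance(b):
    """N^2 * R(B) from (2.2), kept as an exact integer so comparisons are exact."""
    total = 1  # |P_0|^2
    below = 1  # product of the branching factors below the current level
    for b_i in reversed(b):
        total += ((b_i - 1) * below) ** 2
        below *= b_i
    return total


factors = [5, 9, 10, 60]
# N is the same for every ordering, so comparing N^2 * R is the same as comparing R.
best = max(permutations(factors), key=scaled_resistance)
print(best, scaled_resistance(best) / prod(best) ** 2)
```

Among all 24 orderings, the one selected by `max` is the decreasing one, in line with the lemma.<br />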

The following lemma provides a lower bound and an upper bound for the resistance to single<br />

member compromise:<br />

Lemma 2. Let B = (b1, b2, . . . bℓ) be a sorted branching factor vector (i.e., b1 ≥ b2 ≥ . . . ≥ bℓ).<br />

We can give the following lower and upper bounds on R(B):<br />

\[ \left(1 - \frac{1}{b_1}\right)^2 \le R(B) \le \left(1 - \frac{1}{b_1}\right)^2 + \frac{4}{3 b_1^2} \qquad (2.6) \]<br />

Proof. By definition<br />
\[
\begin{aligned}
R &= \frac{1}{N^2}\left(1 + (b_\ell - 1)^2 + \sum_{i=1}^{\ell-1} (b_i - 1)^2 \prod_{j=i+1}^{\ell} b_j^2\right) \\
&= \left(\frac{b_1 - 1}{b_1}\right)^2 + \frac{1}{N^2}\left(1 + (b_\ell - 1)^2 + \sum_{i=2}^{\ell-1} (b_i - 1)^2 \prod_{j=i+1}^{\ell} b_j^2\right)
\end{aligned}
\qquad (2.7)
\]<br />
where it is used that N = b1b2 . . . bℓ. The lower bound in the lemma³ follows directly from (2.7).<br />

³ Note that we could also derive the slightly better lower bound of \(\left(\frac{b_1 - 1}{b_1}\right)^2 + \frac{1}{N^2}\) from (2.7), however, we do not<br />
need that in this chapter.<br />


In order to obtain the upper bound, we can write bi instead of (bi − 1) in the sum in (2.7):<br />

\[
\begin{aligned}
R &< \left(\frac{b_1 - 1}{b_1}\right)^2 + \frac{1}{N^2}\left(1 + \sum_{i=2}^{\ell} \prod_{j=i}^{\ell} b_j^2\right) \\
&= \left(\frac{b_1 - 1}{b_1}\right)^2 + \frac{1}{b_1^2}\left(1 + \sum_{i=2}^{\ell} \prod_{j=2}^{i} \frac{1}{b_j^2}\right)
\end{aligned}
\]<br />

Since bi ≥ 2 for all i, we can write 2 in place of bi in the sum, and we obtain:<br />

\[
\begin{aligned}
R &< \left(\frac{b_1 - 1}{b_1}\right)^2 + \frac{1}{b_1^2}\left(1 + \sum_{i=2}^{\ell} \prod_{j=2}^{i} \frac{1}{4}\right) \\
&= \left(\frac{b_1 - 1}{b_1}\right)^2 + \frac{1}{b_1^2}\left(1 + \sum_{i=2}^{\ell} \left(\frac{1}{4}\right)^{i-1}\right) \\
&< \left(\frac{b_1 - 1}{b_1}\right)^2 + \frac{1}{b_1^2}\left(1 + \sum_{i=2}^{\infty} \left(\frac{1}{4}\right)^{i-1}\right) \\
&= \left(\frac{b_1 - 1}{b_1}\right)^2 + \frac{1}{b_1^2} \cdot \frac{1}{1 - \frac{1}{4}}
\end{aligned}
\]<br />
and this is the upper bound in the lemma. ⋄<br />
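The bounds of Lemma 2 can be checked numerically for the vectors that appear in this chapter. The following Python sketch is only a sanity check; the function names are illustrative, and the vector (2, 2, 2) is an extra test case added here.<br />

```python
from math import prod


def resistance(b):
    """R(B) from (2.2) for a branching factor vector b."""
    n = prod(b)
    total = 1  # |P_0|^2
    below = 1  # product of the branching factors below the current level
    for b_i in reversed(b):
        total += ((b_i - 1) * below) ** 2
        below *= b_i
    return total / n**2


def lemma2_bounds(b1):
    """Lower and upper bound of (2.6); both depend only on the first element b1."""
    lower = (1 - 1 / b1) ** 2
    upper = lower + 4 / (3 * b1**2)
    return lower, upper


vectors = ([30, 30, 30], [60, 10, 9, 5], [72, 5, 5, 5, 3], [2, 2, 2])
for b in vectors:
    lower, upper = lemma2_bounds(b[0])
    print(b, lower, resistance(b), upper)
```

The gap between the two bounds is 4/(3·b1²), so it shrinks quickly as b1 grows, which is the observation used in the paragraph that follows.<br />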

Let us consider the bounds in Lemma 2. Note that the branching factor vector is ordered,<br />

therefore, b1 is not smaller than any other bi. We can observe that if we increase b1, then the<br />

difference between the upper and the lower bounds decreases, and R(B) gets closer to 1. Intuitively,<br />

this implies that in order to find the solution to the optimization problem, b1 should be maximized.<br />

The following lemma confirms this intuition formally:<br />

Lemma 3. Let N and Dmax be the total number of members and the upper bound on the maximum<br />

authentication delay, respectively. Moreover, let B = (b1, b2, . . . , bℓ) and B′ = (b′1, b′2, . . . , b′ℓ′)<br />
be two sorted branching factor vectors that satisfy the constraints of the optimization problem<br />
defined above. Then, b1 > b′1 implies R(B) ≥ R(B′).<br />

Proof. First, we can prove that the statement of the lemma is true if b′1 ≥ 5. We know from<br />
Lemma 2 that<br />
\[ R(B') < \left(1 - \frac{1}{b'_1}\right)^2 + \frac{4}{3 (b'_1)^2} \]<br />
and<br />
\[ R(B) > \left(1 - \frac{1}{b_1}\right)^2 \ge \left(1 - \frac{1}{b'_1 + 1}\right)^2 \]<br />
where we used that b1 > b′1 by assumption. If we can prove that<br />
\[ \left(1 - \frac{1}{b'_1}\right)^2 + \frac{4}{3 (b'_1)^2} \le \left(1 - \frac{1}{b'_1 + 1}\right)^2 \qquad (2.8) \]<br />
then we also proved that R(B′) ≤ R(B). Indeed, multiplying both sides of (2.8) by \((b'_1)^2 (b'_1 + 1)^2\) and<br />
simplifying shows that (2.8) is equivalent to \(2 (b'_1)^2 - 8 b'_1 - 7 \ge 0\), which is<br />
true if \(b'_1 \ge 2 + \sqrt{15/2} \approx 4.74\), and since b′1 is an integer, we are done.<br />

Next, we can make the observation that a branching factor vector A = (a1, . . . , ak, 2, 2) that<br />
has at least two 2s at the end can be improved by joining two 2s into a 4 and obtaining A′ =<br />
(a1, . . . , ak, 4). It is clear that neither the sum nor the product of the elements changes with this<br />
transformation. In addition, we can use the definition of R to get<br />
\[ N^2 \cdot R(A) = ((a_1 - 1) \cdot a_2 \cdots a_k \cdot 2 \cdot 2)^2 + \ldots + ((a_k - 1) \cdot 2 \cdot 2)^2 + ((2 - 1) \cdot 2)^2 + (2 - 1)^2 + 1 \]<br />
and<br />
\[ N^2 \cdot R(A') = ((a_1 - 1) \cdot a_2 \cdots a_k \cdot 4)^2 + \ldots + ((a_k - 1) \cdot 4)^2 + (4 - 1)^2 + 1 \]<br />
Thus, \(R(A') - R(A) = \frac{1}{N^2}(9 - 4 - 1) > 0\), which means that A′ is better than A.<br />

Now, we prove that the lemma is also true for b′1 ∈ {2, 3, 4}:<br />

b′1 = 2: Since B′ is an ordered vector where b′1 is the largest element, it follows that every<br />
element of B′ is 2, and thus, N is a power of 2. From Lemma 2, \(R(B') < \left(1 - \frac{1}{2}\right)^2 + \frac{4}{3 \cdot 2^2} = \frac{7}{12}\)<br />
and \(R(B) > \left(1 - \frac{1}{b_1}\right)^2\). It is easy to see that \(\left(1 - \frac{1}{b_1}\right)^2 \ge \frac{7}{12}\) if \(b_1 \ge \frac{1}{1 - \sqrt{7/12}} \approx 4.23\). Since<br />
b1 > b′1, the remaining cases are b1 = 3 and b1 = 4. However, b1 = 3 cannot be the case,<br />
because N is a power of 2. If b1 = 4, then B can be obtained from B′ by joining pairs of<br />
2s into 4s and then ordering the elements. However, according to the observation above and<br />
Lemma 1, both operations improve the vector. It follows that R(B) ≥ R(B′) must hold.<br />

b′1 = 3: From Lemma 2, \(R(B') < \left(1 - \frac{1}{3}\right)^2 + \frac{4}{3 \cdot 3^2} = \frac{16}{27}\) and \(R(B) > \left(1 - \frac{1}{b_1}\right)^2\). It is easy to see<br />
that \(\left(1 - \frac{1}{b_1}\right)^2 \ge \frac{16}{27}\) if \(b_1 \ge \frac{9}{9 - 4\sqrt{3}} \approx 4.34\). Since b1 > b′1, the only remaining case is b1 = 4.<br />
In this case, the vectors are as follows:<br />
\[ B = (\underbrace{2^2, \ldots, 2^2}_{i}, \underbrace{3, \ldots, 3}_{j}, \underbrace{2, \ldots, 2}_{k}) \]<br />
\[ B' = (\underbrace{3, \ldots, 3}_{j}, \underbrace{2, \ldots, 2}_{2i+k}) \]<br />
where i, j ≥ 1 and k ≥ 0. This means that B can be obtained from B′ by joining i pairs of<br />
2s into 4s and then ordering the elements. However, as we saw earlier, both joining 2s into<br />
4s and ordering the elements improve the vector, and thus, R(B) ≥ R(B′) must hold.<br />

b′1 = 4: Since B′ is an ordered vector where b′1 is the largest element, it follows that N is not<br />
divisible by 5. From Lemma 2, \(R(B') < \left(1 - \frac{1}{4}\right)^2 + \frac{4}{3 \cdot 4^2} = \frac{31}{48}\) and \(R(B) > \left(1 - \frac{1}{b_1}\right)^2\). It is<br />
easy to see that \(\left(1 - \frac{1}{b_1}\right)^2 \ge \frac{31}{48}\) if \(b_1 \ge \frac{1}{1 - \sqrt{31/48}} \approx 5.09\). Since b1 > b′1, the remaining case is<br />
b1 = 5. However, b1 = 5 cannot be the case, because N is not divisible by 5. ⋄<br />

Lemma 3 states that given two branching factor vectors, the one with the larger first element is<br />

always at least as good as the other. The next lemma generalizes this result by stating that given<br />

two branching factor vectors the first j elements of which are equal, the vector with the larger<br />

(j + 1)-st element is always at least as good as the other.<br />

Lemma 4. Let N and Dmax be the total number of members and the upper bound on the maximum<br />

authentication delay, respectively. Moreover, let B = (b1, b2, . . . , bℓ) and B′ = (b′1, b′2, . . . , b′ℓ′)<br />
be two sorted branching factor vectors such that bi = b′i for all 1 ≤ i ≤ j for some j < min(ℓ, ℓ′),<br />
and both B and B′ satisfy the constraints of the optimization problem defined above. Then,<br />
bj+1 > b′j+1 implies R(B) ≥ R(B′).<br />


Proof. By definition<br />
\[
\begin{aligned}
R(B) &= \frac{1}{N^2}\left(1 + (b_\ell - 1)^2 + \sum_{i=1}^{\ell-1} (b_i - 1)^2 \prod_{j=i+1}^{\ell} b_j^2\right) \\
&= \left(\frac{b_1 - 1}{b_1}\right)^2 + \frac{1}{b_1^2}\left(\frac{1}{(N/b_1)^2}\left(1 + (b_\ell - 1)^2 + \sum_{i=2}^{\ell-1} (b_i - 1)^2 \prod_{j=i+1}^{\ell} b_j^2\right)\right) \\
&= \left(\frac{b_1 - 1}{b_1}\right)^2 + \frac{1}{b_1^2} \cdot R(B_1)
\end{aligned}
\]<br />
where B1 = (b2, b3, . . . , bℓ). Similarly,<br />
\[ R(B') = \left(\frac{b'_1 - 1}{b'_1}\right)^2 + \frac{1}{(b'_1)^2} \cdot R(B'_1) \]<br />
where B′1 = (b′2, b′3, . . . , b′ℓ′). Since b1 = b′1, R(B) ≥ R(B′) if and only if R(B1) ≥ R(B′1). By<br />
repeating the same argument for B1 and B′1, we get that R(B) ≥ R(B′) if and only if R(B2) ≥<br />
R(B′2), where B2 = (b3, . . . , bℓ) and B′2 = (b′3, . . . , b′ℓ′). And so on, until we get that R(B) ≥ R(B′)<br />
if and only if R(Bj) ≥ R(B′j), where Bj = (bj+1, . . . , bℓ) and B′j = (b′j+1, . . . , b′ℓ′). But from<br />
Lemma 3, we know that R(Bj) ≥ R(B′j) if bj+1 > b′j+1, and we are done. ⋄<br />

I will now present an algorithm that finds the solution to the optimization problem. However,<br />
before doing that, we need to introduce some further notations. Let B = (b1, b2, . . . , bℓ) and<br />
B′ = (b′1, b′2, . . . , b′ℓ′). Then<br />

∏(B) denotes \(\prod_{i=1}^{\ell} b_i\);<br />

∑(B) denotes \(\sum_{i=1}^{\ell} b_i\);<br />

{B} denotes the set {b1, b2, . . . , bℓ} of the elements of B;<br />

B′ ⊆ B means that {B′} ⊆ {B};<br />

if B′ ⊆ B, then B \ B′ denotes the vector that consists of the elements of {B} \ {B′} in<br />
decreasing order;<br />

if b is a positive integer, then b|B denotes the vector (b, b1, b2, . . . , bℓ).<br />

The algorithm is defined as a recursive function f, which takes two input parameters, a vector<br />

B of positive integers, and another positive integer d, and returns a vector of positive integers. In<br />

order to compute the optimal branching factor vector for a given N and Dmax, f should be called<br />

with the vector that contains the prime factors of N, and Dmax. For instance, if N = 27000 and<br />

Dmax = 90 (the same parameters as in the example in Section 2.2, so that the naïve<br />
and the algorithmic results can be compared), then f should be called with B = (5, 5, 5, 3, 3, 3, 2, 2, 2) and d = 90.<br />
Function f will then return the optimal branching factor vector.<br />

Function f is defined in Algorithm 1.<br />

The operation of the algorithm can be described as follows: The algorithm starts with a branching<br />

factor vector consisting of the prime factors of N. This vector satisfies the first constraint of<br />

the optimization problem by definition. If it does not satisfy the second constraint (i.e., it does not<br />

respect the upper bound on the maximum authentication delay), then no solution exists. Otherwise,<br />

the algorithm successively improves the branching factor vector by maximizing its elements,<br />

starting with the first element, and then proceeding to the next elements, one after the other. Maximization<br />

of an element is done by joining as yet unused prime factors until the resulting divisor<br />

of N cannot be further increased without violating the constraints of the optimization problem.<br />



Algorithm 1 Optimal branching factor generating algorithm<br />
f(B, d)<br />
if ∑(B) > d then<br />
    exit (no solution exists)<br />
else<br />
    find B′ ⊆ B such that ∏(B′) + ∑(B \ B′) ≤ d and ∏(B′) is maximal<br />
end if<br />
if B′ = B then<br />
    return (∏(B′))<br />
else<br />
    return ∏(B′) | f(B \ B′, d − ∏(B′))<br />
end if<br />
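A direct transcription of Algorithm 1 into Python could look as follows. This is a minimal sketch under two assumptions that the pseudocode leaves open: the prime factors are treated as a multiset (repeated primes are kept), and the maximal B′ is found by exhaustively enumerating all sub-multisets. The name `optimal_branching` is hypothetical.<br />

```python
from collections import Counter
from itertools import product
from math import prod


def optimal_branching(factors, d):
    """Recursive function f of Algorithm 1: factors is the vector B, d the delay budget."""
    if sum(factors) > d:
        raise ValueError("no solution exists")
    counts = Counter(factors)
    primes = sorted(counts)
    best = None  # (product of B', exponents chosen for each prime)
    # Enumerate every non-empty sub-multiset B' of the remaining prime factors.
    for exps in product(*(range(counts[p] + 1) for p in primes)):
        if not any(exps):
            continue
        joined = prod(p**e for p, e in zip(primes, exps))
        rest_sum = sum(p * (counts[p] - e) for p, e in zip(primes, exps))
        # Keep the largest product that still leaves room for the unused factors.
        if joined + rest_sum <= d and (best is None or joined > best[0]):
            best = (joined, exps)
    joined, exps = best
    rest = [p for p, e in zip(primes, exps) for _ in range(counts[p] - e)]
    if not rest:  # B' = B
        return [joined]
    return [joined] + optimal_branching(rest, d - joined)


print(optimal_branching([5, 5, 5, 3, 3, 3, 2, 2, 2], 90))
```

The exhaustive search is exponential in the number of distinct primes, which is acceptable for population sizes such as N = 27000 but is a design shortcut of this sketch, not a statement about how the search must be done.<br />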


Theorem 2. Let N and Dmax be the total number of members and the upper bound on the<br />

maximum authentication delay, respectively. Moreover, let B be a vector that contains the prime<br />

factors of N. Then, f(B, Dmax) is an optimal branching factor vector for N and Dmax.<br />

Proof. I will give a sketch of the proof. Let B∗ = f(B, Dmax), and let us assume that there is<br />
another branching factor vector B′ ≠ B∗ that also satisfies the constraints of the optimization<br />
problem and R(B′) > R(B∗). I will show that this leads to a contradiction, hence B∗ must be<br />
optimal.<br />

Let B∗ = (b∗1, b∗2, . . . , b∗ℓ∗) and B′ = (b′1, b′2, . . . , b′ℓ′). Recall that B∗ is obtained by first maximizing<br />
the first element in the vector, therefore, b∗1 ≥ b′1 must hold. If b∗1 > b′1, then R(B∗) ≥ R(B′)<br />
by Lemma 3, and thus, B′ cannot be a better vector than B∗. This means that b∗1 = b′1 must hold.<br />

We know that once b∗1 is determined, the algorithm continues by maximizing the next element<br />
of B∗. Hence, b∗2 ≥ b′2 must hold. If b∗2 > b′2, then R(B∗) ≥ R(B′) by Lemma 4, and thus, B′<br />
cannot be a better vector than B∗. This means that b∗2 = b′2 must hold too.<br />

By repeating this argument, we finally arrive at the conclusion that B∗ = B′ must hold, which<br />
is a contradiction. ⋄<br />

Table 2.1 illustrates the operation of the algorithm for B = (5, 5, 5, 3, 3, 3, 2, 2, 2) and d = 90.<br />

The rows of the table correspond to the levels of the recursion during the execution. The column<br />

labeled with B ′ contains the prime factors that are joined at a given recursion level. The optimal<br />

branching factor vector can be read out from the last column of the table (each row contains one<br />

element of the vector). From this example, we can see that the optimal branching factor vector<br />

for N = 27000 and Dmax = 90 is B ∗ = (72, 5, 5, 5, 3). For the key-tree defined by this vector, we<br />

get R ≈ 0.9725, and D = 90.<br />

Table 2.1: Illustration of the operation of the recursive function f when called with B =<br />

(5, 5, 5, 3, 3, 3, 2, 2, 2) and d = 90. The rows of the table correspond to the levels of the recursion<br />

during the execution.<br />

recursion level | B | d | B′ | ∏(B′)<br />
1 | (5, 5, 5, 3, 3, 3, 2, 2, 2) | 90 | (3, 3, 2, 2, 2) | 72<br />
2 | (5, 5, 5, 3) | 18 | (5) | 5<br />
3 | (5, 5, 3) | 13 | (5) | 5<br />
4 | (5, 3) | 8 | (5) | 5<br />
5 | (3) | 3 | (3) | 3<br />


2.4 Analysis of the general case<br />

So far, we have studied the case of a single compromised member. This already proved to be useful,<br />

because it allowed us to compare different key-trees and to derive a key-tree construction method.<br />

However, one may still be interested in what level of privacy is provided by a system in the general<br />

case when any number of members could be compromised. In this section, I address this problem.<br />

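[Figure 2.3 drawing: key-tree with several compromised leaves, showing the labels of the non-leaf vertices and the resulting anonymity sets.]<br />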

Figure 2.3: Illustration of what happens when several members are compromised. Just as in the<br />

case of a single compromised member, the members are partitioned into anonymity sets, but now<br />

the resulting subsets depend on the number of the compromised members, as well as on their<br />

positions in the tree. Nevertheless, the expected size of the anonymity set of a randomly selected<br />

member is still a good metric for the level of privacy provided by the system, although, in this<br />

general case, it is more difficult to compute.<br />

In what follows, we will need to refer to the non-leaf vertices of the key-tree, and for this reason,<br />

I introduce the labelling scheme that is illustrated in Figure 2.3. In addition, we need to introduce<br />

some further notations. I call a leaf compromised if it belongs to a compromised member, and I<br />

call a non-leaf vertex compromised if it lies on a path that leads to a compromised leaf in the tree.<br />

If vertex v is compromised, then<br />

Kv denotes the set of the compromised children of v, and kv = |Kv|;<br />

Pv denotes the set of subsets (anonymity sets) that belong to the subtree rooted at v (see<br />

Figure 2.3 for illustration); and<br />

S̄v denotes the average size of the subsets in Pv.<br />

We are interested in computing S̄⟨−⟩. We can do that as follows:<br />
\[
\begin{aligned}
\bar{S}_{\langle - \rangle} &= \sum_{P \in \mathcal{P}_{\langle - \rangle}} \frac{|P|^2}{b_1 b_2 \cdots b_\ell} \\
&= \frac{((b_1 - k_{\langle - \rangle}) b_2 \cdots b_\ell)^2}{b_1 b_2 \cdots b_\ell} + \sum_{v \in K_{\langle - \rangle}} \sum_{P \in \mathcal{P}_v} \frac{|P|^2}{b_1 b_2 \cdots b_\ell} \\
&= \frac{((b_1 - k_{\langle - \rangle}) b_2 \cdots b_\ell)^2}{b_1 b_2 \cdots b_\ell} + \frac{1}{b_1} \sum_{v \in K_{\langle - \rangle}} \bar{S}_v
\end{aligned}
\qquad (2.9)
\]<br />
In general, for any vertex ⟨i1, . . . , ij⟩ such that 1 ≤ j < ℓ − 1:<br />
\[ \bar{S}_{\langle i_1, \ldots, i_j \rangle} = \frac{((b_{j+1} - k_{\langle i_1, \ldots, i_j \rangle}) b_{j+2} \cdots b_\ell)^2}{b_{j+1} \cdots b_\ell} + \frac{1}{b_{j+1}} \sum_{v \in K_{\langle i_1, \ldots, i_j \rangle}} \bar{S}_v \qquad (2.10) \]<br />


Finally, for vertices ⟨i1, . . . , iℓ−1⟩ just above the leaves, we get:<br />
\[ \bar{S}_{\langle i_1, \ldots, i_{\ell-1} \rangle} = \frac{(b_\ell - k_{\langle i_1, \ldots, i_{\ell-1} \rangle})^2}{b_\ell} + \frac{k_{\langle i_1, \ldots, i_{\ell-1} \rangle}}{b_\ell} \qquad (2.11) \]<br />
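The recursion (2.9 – 2.11) translates naturally into a recursive function. The following Python sketch assumes that the leaves are numbered 0, . . . , N − 1 from left to right and that the compromised members are given as a set of such leaf indices; this numbering and the function name are conventions chosen for this illustration.<br />

```python
from math import prod


def expected_set_size(b, compromised):
    """Exact expected anonymity set size at the root, following (2.9)-(2.11)."""

    def rec(level, leaves):
        # leaves: compromised leaf indices, relative to the subtree rooted at this vertex
        width = prod(b[level + 1:])  # number of leaves below each child
        total = b[level] * width     # number of leaves below this vertex
        groups = {}                  # compromised children -> their compromised leaves
        for leaf in leaves:
            groups.setdefault(leaf // width, []).append(leaf % width)
        k = len(groups)              # k_v: number of compromised children
        # All leaves under the uncompromised children form one anonymity set.
        size = ((b[level] - k) * width) ** 2 / total
        if level == len(b) - 1:
            return size + k / b[level]  # (2.11): each compromised leaf is a set of size 1
        return size + sum(rec(level + 1, g) for g in groups.values()) / b[level]

    return rec(0, list(compromised))


print(expected_set_size([30, 30, 30], {0}) / 27000)
print(expected_set_size([2, 2], {0, 3}))
```

With a single compromised leaf the first call reduces to the normalized value R of Section 2.2, which gives a convenient consistency check.<br />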

Expressions (2.9 – 2.11) can be used to compute the expected anonymity set size in the system<br />

iteratively, in case of any number of compromised members. However, note that the computation<br />

depends not only on the number c of the compromised members, but also their positions in the tree.<br />

This makes the comparison of different systems difficult, because for a comprehensive analysis,<br />

all possible allocations of the compromised members over the leaves of the key-tree should be<br />

considered. Therefore, a formula is preferred that depends solely on c, but characterizes the<br />

effect of compromised members on the level of privacy sufficiently well, so that it can serve as a<br />

basis for comparison of different systems. In the following, such a formula is derived based on the<br />

assumption that the compromised members are distributed uniformly at random over the leaves of<br />

the key-tree. In some sense, this is a pessimistic assumption as the uniform distribution represents<br />

the worst case, which leads to the largest amount of privacy loss due to the compromised members.<br />

Thus, the approximation that is derived can be viewed as a lower bound on the expected anonymity<br />

set size in the system when c members are compromised.<br />

Let the branching factor vector of the key-tree be B = (b1, b2, . . . , bℓ), and let c be the number of<br />

compromised leaves in the tree. We can estimate k ⟨−⟩ for the root as follows:<br />

\[ k_{\langle - \rangle} \approx \min(c, b_1) = k_0 \qquad (2.12) \]<br />

If a vertex ⟨i⟩ at the first level of the tree is compromised, then the number of compromised<br />

leaves in the subtree rooted at ⟨i⟩ is approximately c/k0 = c1. Then, we can estimate k ⟨i⟩ as<br />

follows:<br />

k ⟨i⟩ ≈ min(c1, b2) = k1<br />

(2.13)<br />

In general, if vertex ⟨i1, . . . , ij⟩ at the j-th level of the tree is compromised, then the number<br />

of compromised leaves in the subtree rooted at ⟨i1, . . . , ij⟩ is approximately cj−1/kj−1 = cj, and<br />

we can use this to approximate k ⟨i1,...,ij⟩ as follows:<br />

k ⟨i1,...,ij⟩ ≈ min(cj, bj+1) = kj<br />

(2.14)<br />

Using these approximations in expressions (2.9 – 2.11), we can derive an approximation for S̄⟨−⟩, which is denoted by S̄0, in the following way:

$$\bar{S}_{\ell-1} = \frac{(b_\ell - k_{\ell-1})^2}{b_\ell} + \frac{k_{\ell-1}}{b_\ell} \tag{2.15}$$

$$\vdots$$

$$\bar{S}_j = \frac{\left((b_{j+1} - k_j)\, b_{j+2} \cdots b_\ell\right)^2}{b_{j+1} \cdots b_\ell} + \frac{k_j}{b_{j+1}}\, \bar{S}_{j+1} \tag{2.16}$$

$$\vdots$$

$$\bar{S}_0 = \frac{\left((b_1 - k_0)\, b_2 \cdots b_\ell\right)^2}{b_1 \cdots b_\ell} + \frac{k_0}{b_1}\, \bar{S}_1 \tag{2.17}$$

Note that expressions (2.15 – 2.17) do not depend on the positions of the compromised leaves in the tree; they depend only on the value of c.
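The recursion defined by (2.12) – (2.17) is easy to evaluate numerically. The following Python sketch shows one possible way to do it: a forward pass computes the estimates k_j, and a backward pass accumulates the approximation from the level just above the leaves up to the root. The function name and the guard for c = 0 are illustrative assumptions of this sketch; it is not the Matlab code used for the results in this chapter.

```python
def approx_anonymity_set_size(branching, c):
    """Approximate expected anonymity set size (S0) for c compromised members.

    branching: sequence (b_1, ..., b_l) of branching factors.
    c: number of compromised leaves.
    """
    # Forward pass, (2.12)-(2.14): k_j = min(c_j, b_{j+1}), c_{j+1} = c_j / k_j.
    ks = []
    c_j = float(c)
    for b in branching:
        k = min(c_j, b)
        ks.append(k)
        c_j = c_j / k if k > 0 else 0.0  # guard for c = 0 (assumption of this sketch)

    # Backward pass, (2.15)-(2.17), starting just above the leaves.
    s_next = 1.0  # a compromised leaf forms an anonymity set of size 1
    below = 1     # b_{j+2} * ... * b_l: number of leaves under one child
    for b, k in zip(reversed(branching), reversed(ks)):
        untouched = (b - k) * below  # leaves under uncompromised children
        s_next = untouched ** 2 / (b * below) + k / b * s_next
        below *= b
    return s_next
```

Dividing the returned value by N gives the normalized quantity; for example, `approx_anonymity_set_size((30, 30, 30), 1) / 27000` evaluates the approximation for a single compromised member in the (30, 30, 30) tree.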

To see how well S̄0 estimates S̄⟨−⟩, I ran simulations with the following parameters:

• total number of members: N = 27000;
• upper bound on the maximum authentication delay: Dmax = 90;
• two branching factor vectors are considered: (30, 30, 30) and (72, 5, 5, 5, 3);
• the number c of compromised members is varied between 1 and 270 with a step size of one.


For each value of c, I ran 100 simulations⁴. In each simulation run, the c compromised members were chosen uniformly at random from the set of all members, and the exact value of the normalized expected anonymity set size S̄⟨−⟩/N was computed using expressions (2.9 – 2.11). Finally, the obtained values were averaged over all simulation runs. Moreover, for every c, I also computed the estimated value S̄0/N using expressions (2.15 – 2.17).

The simulation results are shown in Figure 2.4. The figure does not show the confidence intervals, because they are very small (in the range of 10⁻⁴ for all simulations) and thus would hardly be visible. As we can see, S̄0/N approximates S̄⟨−⟩/N quite well, and in general it provides a lower bound on the normalized expected anonymity set size.
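The simulation procedure described above can be sketched in Python as follows. The sketch computes the exact expected anonymity set size for a concrete set of compromised leaves by walking the key-tree: at every vertex, the leaves under the uncompromised children form one anonymity set, and the compromised children are processed recursively. The leaf numbering (0 to N − 1, left to right), the function names, and the fixed random seed are assumptions made for illustration only.

```python
import random
from math import prod


def exact_anonymity_set_size(branching, compromised):
    """Expected anonymity set size, sum(|P|^2) / N, for given compromised leaves."""
    n_total = prod(branching)

    def sum_of_squares(level, offsets):
        # offsets: compromised leaves, numbered relative to this vertex's subtree
        child_size = prod(branching[level + 1:])
        by_child = {}
        for off in offsets:
            by_child.setdefault(off // child_size, []).append(off % child_size)
        # Leaves under uncompromised children are mutually indistinguishable.
        untouched = (branching[level] - len(by_child)) * child_size
        total = untouched ** 2
        for child_offsets in by_child.values():
            if level == len(branching) - 1:
                total += 1  # a compromised leaf is alone in its anonymity set
            else:
                total += sum_of_squares(level + 1, child_offsets)
        return total

    return sum_of_squares(0, list(set(compromised))) / n_total


def simulate(branching, c, runs, seed=0):
    """Average the normalized exact value over random choices of c leaves."""
    rng = random.Random(seed)
    n_total = prod(branching)
    values = [
        exact_anonymity_set_size(branching, rng.sample(range(n_total), c))
        for _ in range(runs)
    ]
    return sum(values) / runs / n_total
```

A call such as `simulate((30, 30, 30), 10, 100)` mirrors one point of the simulated curve; comparing it with the approximation for the same c indicates how tight the estimate is.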

[Figure 2.4, two panels. x-axis: number of compromised members (c), from 0 to 300; y-axis: normalized average anonymity set size, from 0 to 1. Each panel plots the simulation result (S̄⟨−⟩/N) and the approximation (S̄0/N).]

Figure 2.4: Simulation results for branching factor vectors (30, 30, 30) (left hand side) and (72, 5, 5, 5, 3) (right hand side). As we can see, S̄0/N approximates S̄⟨−⟩/N quite well, and in general it provides a lower bound on it.

In Figure 2.5, the value of S̄0/N is plotted as a function of c for different branching factor vectors. This figure illustrates how different systems can be compared using the approximation S̄0/N of the normalized expected anonymity set size. On the left hand side of the figure, we can see that the value of S̄0/N is greater for the vector B* = (72, 5, 5, 5, 3) than for the vector B = (30, 30, 30) not only for c = 1 (as we saw before), but for larger values of c too. In fact, B* seems to lose its superiority only when the value of c approaches 60, but in this range, the systems provide nearly no privacy in any case. Thus, we can conclude that B* is a better branching factor vector, yielding more privacy than B in general.

We can make another interesting observation on the left hand side of Figure 2.5: S̄0/N starts decreasing sharply as c starts increasing; however, when c gets close to the value of the first element of the branching factor vector, the decrease of S̄0/N slows down. Moreover, almost exactly when c reaches the value of the first element (30 in the case of B, and 72 in the case of B*), S̄0/N seems to become constant, but at a very low value. We can conclude that, just as in the case of a single compromised member, in the general case too, the level of privacy provided by the system essentially depends on the value of the first element of the branching factor vector. The plot on the right hand side of the figure reinforces this observation: it shows S̄0/N for two branching factor vectors that have the same first element but differ in the other elements. As we can see, the curves overlap almost perfectly.

Thus, a practical design principle for key-tree based private authentication systems is to maximize<br />

the branching factor at the first level of the key-tree. Further optimization by adjusting the<br />

branching factors of the lower levels may still be possible, but the gain is not significant; what<br />

really counts is the branching factor at the first level.<br />

⁴ All computations were done in Matlab, and for the purpose of repeatability, the source code is available online at http://www.crysys.hu/∼holczer/PET2006



[Figure 2.5, two panels. x-axis: number of compromised members (c), from 0 to 100; y-axis: estimated normalized average anonymity set size (S̄0/N), from 0 to 1. Left panel: B = [72 5 5 5 3] and B = [30 30 30]. Right panel: B = [60 30 15] and B = [60 5 5 3 3 2].]

Figure 2.5: The value of S̄0/N as a function of c for different branching factor vectors. The figure illustrates how different systems can be compared based on the approximation S̄0/N. On the left hand side, we can see that the value of S̄0/N is greater for the vector (72, 5, 5, 5, 3) than for the vector (30, 30, 30) not only for c = 1 (as we saw earlier), but for larger values of c too. On the right hand side, we can see that S̄0/N is almost the same for the vector (60, 5, 5, 3, 3, 2) as for the vector (60, 30, 15). We can conclude that S̄0/N is essentially determined by the value of the first element of the branching factor vector.

2.5 The group-based approach<br />

In the group-based authentication scheme, the set of all tags is divided into groups of equal size, and all tags of a given group share a common group key. Since the group keys do not enable the reader to identify the tags uniquely, every tag also stores a unique identifier. Keys are secret (each group key is known only to the reader and the members of the corresponding group), but identifiers can be public. To prevent a tag from impersonating another tag of the same group, every tag has a unique secret key as well. This key is shared only between the tag and the reader. To reduce the storage demands on the reader side, this pairwise key can be generated from a master key using the identifier of the tag.

In order to authenticate a tag, the reader sends a single challenge to the tag. The answer of the tag has two parts. In the first part, the tag encrypts, with the group key, the reader's challenge concatenated with a nonce picked by the tag and with the tag's identifier. In the second part, the tag encrypts the challenge concatenated with the nonce using its own secret key. Encrypting the identifier is needed since the group key does not identify the tag uniquely. Upon reception of the answer, the reader identifies the tag by trying all the group keys until the decryption succeeds. It then verifies that the second part was encrypted with the secret key of the identified tag. Without the second part, every tag could impersonate every other tag in the same group.
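The message flow described above can be sketched in Python as follows. The scheme only requires some symmetric-key encryption E; since the Python standard library offers no block cipher, this sketch substitutes a toy construction (a SHA-256 based XOR stream plus an HMAC tag) that is NOT suitable for real use. The function names, the 16-byte nonce length, and the HMAC-based derivation of the tag key from the master key are likewise assumptions made for illustration.

```python
import hashlib
import hmac
import os

NONCE_LEN = 16  # length of R2 in bytes (assumption of this sketch)


def _keystream(key, iv, length):
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + iv + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]


def encrypt(key, plaintext):
    """Toy authenticated encryption (illustration only): XOR stream + HMAC."""
    iv = os.urandom(16)
    stream = _keystream(key, iv, len(plaintext))
    body = bytes(p ^ s for p, s in zip(plaintext, stream))
    return iv + body + hmac.new(key, iv + body, hashlib.sha256).digest()


def decrypt(key, blob):
    """Return the plaintext, or None if blob was not produced with key."""
    iv, body, mac = blob[:16], blob[16:-32], blob[-32:]
    expected = hmac.new(key, iv + body, hashlib.sha256).digest()
    if not hmac.compare_digest(mac, expected):
        return None
    return bytes(c ^ s for c, s in zip(body, _keystream(key, iv, len(body))))


def derive_tag_key(master_key, tag_id):
    """Tag's own key, generated from the master key and the tag identifier."""
    return hmac.new(master_key, tag_id, hashlib.sha256).digest()


def tag_answer(group_key, own_key, tag_id, r1):
    """Tag side: E_K(R1|R2|ID) and E_KID(R1|R2)."""
    r2 = os.urandom(NONCE_LEN)  # nonce picked by the tag
    return encrypt(group_key, r1 + r2 + tag_id), encrypt(own_key, r1 + r2)


def reader_identify(group_keys, master_key, r1, part1, part2):
    """Reader side: try all group keys, then check the tag's own key."""
    for group_key in group_keys:  # at most one trial per group
        plain = decrypt(group_key, part1)
        if plain is None or plain[:len(r1)] != r1:
            continue
        r2 = plain[len(r1):len(r1) + NONCE_LEN]
        tag_id = plain[len(r1) + NONCE_LEN:]
        own_key = derive_tag_key(master_key, tag_id)
        return tag_id if decrypt(own_key, part2) == r1 + r2 else None
    return None
```

In this sketch, a tag that knows the group key but not the victim's own key can produce a valid first part, yet the second check fails and `reader_identify` returns `None`, which is the impersonation protection the second part is meant to provide.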

The operation of the group-based private authentication scheme is illustrated in Figure 2.6.<br />

The complexity of the group-based scheme for the reader depends on the number of the groups. In particular, if there are γ groups, then, in the worst case, the reader must try γ keys. Therefore, if the upper bound on the worst case complexity is given as a design parameter, then γ is easily determined. For example, to get the same complexity as in the key-tree based scheme with constant branching factor, one may choose γ = (b log_b N) − 1, where N is the total number of tags and b is the branching factor of the key-tree. The minus one accounts for the decryption of the second part of the message.

An immediate advantage of the group-based scheme with respect to the key-tree based approach is that the tags need to store only two keys and an identifier. In contrast to this, in the key-tree based scheme, the number of keys stored by the tags depends on the depth of the tree. For instance, in the case of the Molnar-Wagner scheme, the tags must store log_b N keys. Moreover, by using only two keys, the group-based scheme also has a smaller complexity for the tag in terms of computation and communication.

Besides its advantages with respect to complexity, the group-based scheme provides a higher level of privacy than the key-tree based scheme when some of the tags are compromised. I will show this in Section 2.7.

[Figure 2.6, message sequence between reader R and tag T: the reader picks R1 and sends it to the tag; the tag picks R2 and answers with EK(R1|R2|ID), EKID(R1|R2); the reader tries all group keys until K is found, then checks the own key of ID.]

Figure 2.6: Operation of the group-based private authentication scheme. K is the group key stored by the tag, KID is the tag’s own secret key, ID is the identifier of the tag, R1 and R2 are random values generated by the reader and the tag, respectively, | denotes concatenation, and EK() denotes symmetric-key encryption with K.

[Figure 2.7, two panels. Left: a binary key-tree with keys k1, k2 at the first level, k1,1 to k2,2 at the second level, and k1,1,1 to k2,2,2 at the leaves. Right: four group keys K1 to K4 above eight tag keys KID1 to KID8.]

Figure 2.7: On the left hand side: The tree-based authentication protocol uses a tree, where the tags correspond to the leaves of the tree. Each tag stores the keys along the path from the root to the leaf corresponding to the given tag. When authenticating itself, a tag uses all of its keys. The reader identifies which keys have been used by iteratively searching through the keys at the successive levels of the tree. On the right hand side: In the group-based authentication protocol, the tags are divided into groups. Each tag stores its group key and its own key. When authenticating itself, a tag uses its group key first, and then its own key. The reader identifies which group key has been used by trying all group keys, then it checks the tag's own key.

2.6 Analysis of the group based approach<br />

The metric proposed in Section 2.2 is based on the observation that when some tags are compromised, the set of all tags becomes partitioned such that the adversary cannot distinguish the tags that belong to the same subset, but she can distinguish the tags that belong to different subsets. Hence, the subsets are the anonymity sets of their members. The level R of privacy provided by the scheme is then characterized as the average anonymity set size normalized with the total number N of the tags. Formally,

$$R = \frac{1}{N}\sum_i |P_i|\,\frac{|P_i|}{N} = \frac{1}{N^2}\sum_i |P_i|^2 \tag{2.18}$$

where |Pi| denotes the size of subset Pi and |Pi|/N is the probability that a randomly chosen tag belongs to subset Pi.

In the group-based scheme, a similar kind of partitioning can be observed when tags become compromised. In particular, when a single tag is compromised, the adversary learns the group key of that tag, which allows her to distinguish the tags within this group from each other (since the tags use their identifiers in the protocol) and from the rest of the tags in the system. This means that each member of the compromised group forms an anonymity set of size 1, and the remaining tags form another anonymity set. In general, when more tags are compromised, we can observe that the partitioning depends on the number C of the compromised groups, where a group is compromised if at least one tag that belongs to that group is compromised. More precisely, when C groups are compromised, we get nC anonymity sets of size 1 and an anonymity set of size n(γ − C), where γ is the number of groups and n = N/γ is the size of a group. This results in the following expression for the level R of the privacy according to the metric (2.18):

$$R = \frac{1}{N^2}\left(nC + \left(n(\gamma - C)\right)^2\right) \tag{2.19}$$

If tags are compromised randomly, then C, and hence R, are random variables, and the level of privacy provided by the system is characterized by the expected value of R. Since expression (2.19) is quadratic in C, that is, E[R] = (n E[C] + n²(γ² − 2γ E[C] + E[C²]))/N², computing it requires the expected value of C and that of C². This can be done as follows: let us denote by A_i the event that at least one tag from the i-th group is compromised, and let I_{A_i} be the indicator function of A_i. The probability of A_i can be calculated as follows:

\begin{align}
P(A_i) &= 1 - \frac{\binom{N-n}{c}}{\binom{N}{c}} \tag{2.20}\\
&= 1 - \prod_{j=0}^{c-1}\left(1 - \frac{n}{N-j}\right) \tag{2.21}
\end{align}

The expected value of C is the expected value of the sum of the indicator functions:<br />

\begin{align}
E[C] &= E\left[\sum_{i=1}^{\gamma} I_{A_i}\right] = \sum_{i=1}^{\gamma} P(A_i) \tag{2.22}\\
&= \gamma\left(1 - \prod_{j=0}^{c-1}\left(1 - \frac{n}{N-j}\right)\right) \tag{2.23}
\end{align}

Similarly, the second moment of C can be computed as follows:<br />

\begin{align}
E\left[C^2\right] &= E\left[\left(\sum_{i=1}^{\gamma} I_{A_i}\right)^2\right] \tag{2.24}\\
&= E\left[\sum_{i=1}^{\gamma} I_{A_i}\right] + E\left[\sum_{i \neq j} I_{A_i \cap A_j}\right] \tag{2.25}\\
&= E[C] + \left(\gamma^2 - \gamma\right) P(A_i \cap A_j) \tag{2.26}
\end{align}

Finally, probability P (Ai ∩ Aj) can be computed in the following way:<br />

\begin{align}
P(A_i \cap A_j) &= \tag{2.27}\\
&= 1 - P(\bar{A}_i \cap \bar{A}_j) - 2P(A_i \cap \bar{A}_j) \tag{2.28}
\end{align}

Here, \bar{A}_i denotes the complement of the event A_i, and the factor 2 follows from the symmetry of the groups. The two probabilities on the right hand side are:

\begin{align}
P(\bar{A}_i \cap \bar{A}_j) &= \frac{\binom{N-2n}{c}}{\binom{N}{c}} \tag{2.29}\\
&= \prod_{j=0}^{c-1}\left(1 - \frac{2n}{N-j}\right) \tag{2.30}
\end{align}

\begin{align}
P(A_i \cap \bar{A}_j) &= P(A_i \mid \bar{A}_j)\, P(\bar{A}_j) \tag{2.31}\\
&= \left[1 - \prod_{j=0}^{c-1}\left(1 - \frac{n}{N-n-j}\right)\right] \cdot \tag{2.32}\\
&\quad \cdot \prod_{j=0}^{c-1}\left(1 - \frac{n}{N-j}\right) \tag{2.33}
\end{align}

Based on the above formulae, the expected value of R is computed as a function of c for N = 2¹⁴ and γ = 64. The results are plotted on the left hand side of Figure 2.8. The same plot also contains the results of a Matlab simulation with the same parameters, where the c compromised tags were chosen uniformly at random. For each value of c, I ran 10 simulations, computed the exact value of the average anonymity set size using (2.19) directly, and averaged the results. As can be seen in the figure, the analytical results match the results of the simulation. I performed the same verification for several other values of N and γ, and in each case, I obtained the same matching results.
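The analytical computation above can be checked on a small instance with the following Python sketch. It evaluates E[C] and E[C²] from the product forms of the probabilities, combines them into E[R] by taking the expectation of (2.19) term by term (this combination step is my reading of the derivation, not a formula quoted from the text), and compares the result with an exhaustive enumeration of all choices of c compromised tags. The function names are illustrative, and the sketch assumes that γ divides N and that c ≤ N − n.

```python
from itertools import combinations


def product_term(total, c, removed):
    """prod_{j=0}^{c-1} (1 - removed / (total - j))."""
    p = 1.0
    for j in range(c):
        p *= 1.0 - removed / (total - j)
    return p


def expected_R(N, gamma, c):
    """Analytical E[R] for the group-based scheme (requires c <= N - n)."""
    n = N // gamma
    p_not_i = product_term(N, c, n)          # P(no tag of group i compromised)
    e_c = gamma * (1.0 - p_not_i)            # (2.23)
    p_neither = product_term(N, c, 2 * n)    # (2.30)
    p_i_not_j = (1.0 - product_term(N - n, c, n)) * p_not_i  # (2.32)-(2.33)
    p_both = 1.0 - p_neither - 2.0 * p_i_not_j               # (2.28)
    e_c2 = e_c + (gamma ** 2 - gamma) * p_both               # (2.26)
    # Expectation of (2.19), expanded in powers of C.
    return (n * e_c + n * n * (gamma ** 2 - 2 * gamma * e_c + e_c2)) / N ** 2


def brute_force_expected_R(N, gamma, c):
    """Average of (2.19) over every possible set of c compromised tags."""
    n = N // gamma
    total, count = 0.0, 0
    for subset in combinations(range(N), c):
        groups_hit = len({tag // n for tag in subset})  # the value of C
        total += (n * groups_hit + (n * (gamma - groups_hit)) ** 2) / N ** 2
        count += 1
    return total / count
```

The exhaustive check is only feasible for very small N; for parameters such as N = 2¹⁴ and γ = 64, random sampling as described in the text is the practical alternative.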

2.7 Comparison of the group and the key-tree based approach<br />

In this section, I compare the group-based scheme to the key-tree based scheme. The methodology is the following: for a given number N of tags and upper bound γ on the worst case complexity for the reader, I determine the optimal key-tree using the algorithm proposed in Section 2.3. Then, I compare the level of privacy provided by this optimal key-tree to that provided by the group-based scheme with γ groups and N tags.

The comparison is performed by means of simulations. A simulation run consists of randomly choosing c compromised tags, and computing the resulting normalized average anonymity set size R for both the optimal key-tree and the group-based scheme. For the former, I use the formulae (2.15 – 2.17), while for the latter, I use formula (2.19) directly. For each value of c, I perform several simulation runs, and average the results.

The simulation parameters were the following: for the number N of tags, only powers of 2 are considered, because in practice, that number is related to the size of the identifier space, and identifiers are usually represented as binary strings. Thus, in the simulations, N = 2^x, and x is varied between 10 and 15 with a step size of 1. The values for the worst case complexity γ (which coincides with the number of groups in the group-based scheme) were 64, 128, and 256. Finally, the number c of compromised tags is varied from 1 to 3γ. For each combination of these values, 100 simulation runs were performed.

The right hand side of Figure 2.8 shows the results obtained for N = 2¹⁰ and γ = 64. The plots corresponding to the other simulation settings are not included, because they are very similar to the one in Figure 2.8. As we can see, the group-based scheme provides a higher level of privacy when the number of compromised tags does not exceed a threshold. Above the threshold, the key-tree based scheme becomes better; however, in this region, both schemes provide virtually no privacy. Thus, for any practical purposes, the group-based scheme is better than the key-tree based scheme (even if optimal key-trees are used).

[Figure 2.8, two panels. x-axis: number of compromised members (c), from 0 to 200; y-axis: level of privacy (R), from 0 to 1. Left panel: simulation result and formal result. Right panel: tree based authentication and group based authentication.]

Figure 2.8: On the left hand side: The analytical results obtained for the expected value of R match the averaged results of ten simulations. The parameters are: N = 2¹⁴ and γ = 64. On the right hand side: Results of the simulation aiming at comparing the key-tree based scheme and the group-based scheme. The curves show the level R of privacy as a function of the number c of the compromised tags. The parameters are: N = 2¹⁰ and γ = 64. The confidence intervals are not shown, because they are in the range of 10⁻³, and therefore, they would be hardly visible. As we can see, the group-based scheme achieves a higher level of privacy when c is below a threshold. Above the threshold, the key-tree based approach is slightly better; however, in this region, both schemes provide virtually no privacy.

2.8 Related work<br />

The problem of private authentication has been extensively studied in the literature recently, but most of the proposed solutions are based on public key cryptography. One example is Idemix, which is a practical anonymous credential system proposed by Camenisch and Lysyanskaya in [Camenisch and Lysyanskaya, 2001]. Idemix allows for unlinkable demonstration of the possession of various credentials, and it can be used in many applications. However, it is not applicable in resource-constrained scenarios, such as low-cost RFID systems. For such applications, solutions based on symmetric key cryptography seem to be the only viable options. A comprehensive bibliography of RFID related privacy problems is maintained by Avoine in [Avoine, 2012]. A recent survey of RFID privacy approaches was published by Langheinrich [Langheinrich, 2009], where he overviews 60 papers in this field. Another important paper, by Syamsuddin et al., surveys the hash chain based RFID authentication protocols [Syamsuddin et al., 2008]. In the following, I focus on the methods similar to the ones described in this thesis, and encourage the reader to read the aforementioned surveys for a broader view.

The key-tree based approach for symmetric key private authentication has been proposed by<br />

Molnar and Wagner in [Molnar and Wagner, 2004]. However, they use a simple b-ary tree, which<br />

means that the tree has the same branching factor at every level. Moreover, they do not analyze<br />

the effects of compromised members on the level of privacy provided. They only mention that<br />

compromise of a member has a wider effect than in the case of public key cryptography based<br />

solutions.<br />

An entropy based analysis of key trees can be found in [Nohara et al., 2005]. Nohara et al. prove that their K-steps ID matching scheme (which is very similar to [Molnar and Wagner, 2004]) is secure against one compromised tag, if the number of nodes is large enough. They consider only b-ary trees, not variable branching factors.


Avoine et al. analyze the effects of compromised members on privacy in the key-tree based<br />

approach [Avoine et al., 2005]. They study the case of a single compromised member, as well as the general case of any number of compromised members. However, their analysis is not based on the notion

of anonymity sets. In their model, the adversary is first allowed to compromise some members,<br />

then it chooses a target member that it wants to trace, and it is allowed to interact with the chosen<br />

member. Later, the adversary is given two members such that one of them is the target member<br />

chosen by the adversary. The adversary can interact with the given members, and it must decide<br />

which one is its target. The level of privacy provided by the system is quantified by the success<br />

probability of the adversary.<br />

Beye and Veugen go a little further and analyze what happens if the attacker has access to side channel information, and adapts the attack dynamically [Beye and Veugen, 2012]. They analyze the case of key trees described in this chapter as well and generalize the problem by setting only a minimum on N in [Beye and Veugen, 2011].

2.9 Conclusion<br />

Key-trees provide an efficient solution for private authentication in the symmetric key setting.<br />

However, the level of privacy provided by key-tree based systems decreases considerably if some<br />

members are compromised. This loss of privacy can be minimized by the careful design of the<br />

tree. Based on the results presented in this chapter, we can conclude that a good practical design<br />

principle is to maximize the branching factor at the first level of the tree such that the resulting<br />

tree still respects the constraint on the maximum authentication delay in the system. Once the<br />

branching factor at the first level is maximized, the tree can be further optimized by maximizing<br />

the branching factors at the successive levels, but the improvement achieved in this way is not<br />

really significant; what really counts is the branching factor at the first level.<br />

In the second part of this chapter, I proposed a novel group-based private authentication scheme. I analyzed the proposed scheme and quantified the level of privacy that it provides. I compared the group-based scheme to the key-tree based scheme originally proposed by Molnar and Wagner, and later optimized by me in the first half of this chapter. I showed that the group-based scheme provides a higher level of privacy than the key-tree based scheme. In addition, the complexity of the group-based scheme for the verifier can be set to be the same as in the key-tree based scheme, while the complexity for the prover is always smaller in the group-based scheme. The primary application area of the schemes is RFID systems, but they can also be used in applications with similar characteristics (e.g., in wireless sensor networks).

2.10 Related publications<br />

[Buttyan et al., 2006a] Levente Buttyan, Tamas Holczer, and Istvan Vajda. Optimal key-trees<br />

for tree-based private authentication. In Proceedings of the International Workshop on Privacy<br />

Enhancing Technologies (PET), June 2006. Springer.<br />

[Buttyan et al., 2006b] Levente Buttyan, Tamas Holczer, and Istvan Vajda. Providing location<br />

privacy in automated fare collection systems. In Proceedings of the 15th IST Mobile and Wireless<br />

Communication Summit, Mykonos, Greece, June 2006.<br />

[Avoine et al., 2007] Gildas Avoine, Levente Buttyan, Tamas Holczer, and Istvan Vajda. Group-based private authentication. In Proceedings of the International Workshop on Trust, Security, and Privacy for Ubiquitous Computing (TSPUC 2007). IEEE, 2007.



Chapter 3<br />

Location Privacy in Vehicular Ad Hoc Networks

3.1 Introduction<br />

In this chapter, I investigate what level of privacy a driver can achieve in Vehicular Ad Hoc Networks (VANET). More specifically, in the first half of this chapter, I investigate how a local eavesdropping attacker can trace the vehicles based on their frequently sent status information. In the second half of this chapter (from Section 3.4), I consider a stronger attacker, and examine what a global eavesdropping attacker can do. After establishing its broad capabilities, I suggest an algorithm that can greatly reduce the attacker's success rate.

Recently, initiatives to create safer and more efficient driving conditions have begun to draw<br />

strong support in Europe [COM], in the US [VSC], and in Japan [ASV]. Vehicular communications<br />

will play a central role in this effort, enabling a variety of applications for safety, traffic<br />

efficiency, driver assistance, and entertainment. However, besides the expected benefits, vehicular<br />

communications also have some potential drawbacks. In particular, many envisioned safety related<br />

applications require that the vehicles continuously broadcast their current position and speed in<br />

so called heart beat messages. This allows the vehicles to predict the movement of other nearby<br />

vehicles and to warn the drivers if a hazardous situation is about to occur. While this can certainly<br />

be advantageous, an undesirable side effect is that it makes it easier to track the physical location<br />

of the vehicles just by eavesdropping these heart beat messages.<br />

One approach to solve this problem is that the vehicles broadcast their messages under pseudonyms<br />

that they change with some frequency [Raya and Hubaux, 2005]. The change of a pseudonym<br />

means that the vehicle changes all of its physical and logical addresses at the same time. Indeed, in<br />

most of the applications, the important thing is to let other vehicles know that there is a vehicle at<br />

a given position moving with a given speed, but it is not really important which particular vehicle<br />

it is. Thus, using pseudonyms is just as good as using real identifiers as far as the functionality of<br />

the applications is concerned. Obviously, these pseudonyms must be generated in such a way that<br />

a new pseudonym cannot be directly linked to previously used pseudonyms of the same vehicle.<br />

Unfortunately, only changing pseudonyms is largely ineffective against a global eavesdropper that can hear all communications in the network. Such an adversary can predict the movement of the vehicles based on the position and speed information in the heart beat messages, and use this prediction to link different pseudonyms of the same vehicle together with high probability. For instance, if at time t, a given vehicle is at position p⃗ and moves with speed v⃗, then after some short time τ, this vehicle will most probably be at position p⃗ + τ · v⃗. Therefore, the adversary will know that the vehicle that reports itself at (or near to) position p⃗ + τ · v⃗ at time t + τ is the same vehicle as the one that reported itself at position p⃗ at time t, even if in the meantime, the vehicle changed pseudonym. This problem can be solved with some silent periods. This is discussed in the second part of this chapter (from Section 3.4).
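The linking attack outlined above can be illustrated with a short Python sketch. It predicts where a vehicle should be after time τ and picks the new heart beat message whose reported position is closest to the prediction. The dictionary layout of the messages, the function names, and the `max_error` tolerance are hypothetical choices made for this illustration; a real tracker would also have to handle acceleration, measurement noise, and several candidates at once.

```python
import math


def predict_position(position, velocity, tau):
    """Linear prediction: p + tau * v in two dimensions."""
    return (position[0] + tau * velocity[0], position[1] + tau * velocity[1])


def link_pseudonym(old_beacon, new_beacons, tau, max_error):
    """Return the pseudonym among new_beacons that best matches old_beacon.

    old_beacon: dict with "pos" and "vel" observed at time t.
    new_beacons: list of dicts with "pseudonym" and "pos" observed at t + tau.
    Returns None if no candidate lies within max_error of the prediction.
    """
    px, py = predict_position(old_beacon["pos"], old_beacon["vel"], tau)
    best, best_distance = None, max_error
    for beacon in new_beacons:
        distance = math.hypot(beacon["pos"][0] - px, beacon["pos"][1] - py)
        if distance <= best_distance:
            best, best_distance = beacon["pseudonym"], distance
    return best
```

With a short τ, the prediction error is small, which is why a pseudonym change alone does not prevent linking when the adversary hears every message.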

On the other hand, the assumption that the adversary can eavesdrop all communications in the network is a very strong one. In many situations, it is more reasonable to assume that the adversary can monitor the communications only at a limited number of places and only in a limited range. In this case, if a vehicle changes its pseudonym within the non-monitored area, then there is a chance that the adversary loses its trace. My goal in the first half of the chapter is to characterize this chance as a function of the strength of the adversary (i.e., its monitoring capabilities). In the second part of the chapter, I assume a relatively small area, where assuming a global eavesdropper is reasonable. I analyze what a global attacker can do, and suggest a simple algorithm to reduce the capabilities of a global attacker. In particular, my main contributions are the following:

I define a model in which the effectiveness of changing pseudonyms can be studied. I emphasize<br />

that while changing pseudonyms has already been proposed in the literature as a<br />

countermeasure to track vehicles [Raya and Hubaux, 2005], to the best of my knowledge, the<br />

effectiveness of this method has never been investigated rigorously in this context. My model<br />

is based on the concept of the mix zone. This concept was first introduced in [Beresford and<br />

Stajano, 2003], but again, to the best of my knowledge, it has not been used in the context<br />

of vehicular networks so far. I characterize the tracking strategy of the adversary in the mix<br />

zone model, and I introduce a metric to quantify the level of privacy provided by the mix<br />

zone.<br />

I report on the results of an extensive simulation where I used my model to determine the<br />

level of privacy achieved in realistic scenarios. In particular, in my simulation, I used a rather<br />

complex road map, generated traffic with realistic parameters, and varied the strength of the<br />

adversary by varying the number of her monitoring points. As expected, my simulation<br />

results confirm that the level of privacy decreases as the strength of the adversary increases.<br />

However, in addition to this, my simulation results provide detailed information about the<br />

relationship between the strength of the adversary and the level of privacy achieved by<br />

changing pseudonyms.<br />

In Section 3.5, I provide a breakdown of the requirements that a system must address in order<br />

to provide privacy. The aim is to provide an analytical framework that future researchers<br />

can use to concisely state which aspects of privacy a new proposal does or does not address.<br />

In Section 3.6, I propose an approach for implementing mix zones that requires neither<br />

extensive RSU support nor complex communication between vehicles, and that does not<br />

endanger safety-of-life to any significant extent, while providing both syntactic mixing and<br />

semantic mixing (in the language of Section 3.5). To my knowledge, this is the first proposal<br />

that provides for semantic mixing while at the same time addressing the safety-of-life concerns<br />

that naturally arise when a vehicle tries to obscure its path. The key insights are simply that<br />

vehicles traveling at a low speed are less likely to cause fatal accidents, and that vehicles will<br />

be traveling at a low speed at natural mix-points such as signalled intersections. The main<br />

body of experimental work in Section 3.6 is therefore an investigation of the consequences<br />

for the untraceability of vehicles if they stop sending heartbeat messages when their speed<br />

drops below a certain threshold and change all their identifiers after such silent periods. I<br />

call my scheme SLOW, which stands for silence at low speeds. (I note that of course SLOW<br />

is not a full solution to untraceability, as it does not cover the safe use of silent periods at<br />

high speeds; other techniques will need to be used to give untraceability in this case).<br />

The organization of the chapter is the following: in Section 3.2, I introduce the mix zone<br />

model, I define the behavior of the adversary in this model, and I introduce my privacy metric.<br />

In Section 3.3, I describe my simulation setting and the simulation results for the mix zones. In<br />

Section 3.4, I introduce the global attacker scenario. Then I introduce my overall analytical<br />

framework in Section 3.5. Next, in Section 3.6, I introduce my attacker model and my proposed<br />


solution, and in Section 3.7, I present the results of my experiments showing that my approach<br />

does indeed make tracing of vehicles hard for the attacker, and that it is usable in the real world.<br />

Finally, I report on some related work in Section 3.8, and conclude the chapter in Section 3.9.<br />

3.2 Model of local attacker and mix zone<br />

3.2.1 The concept of the mix zone<br />

I consider a continuous part of a road network, such as a whole city or a district of a city. I assume<br />

that the adversary installed some radio receivers at certain points of the road network with which<br />

she can eavesdrop on the communications of the vehicles, including their heartbeat messages, in a<br />

limited range. On the other hand, outside the range of her radio receivers, the adversary cannot<br />

hear the communications of the vehicles.<br />

Thus, the road network is divided into two distinct regions: the observed zone and the unobserved<br />

zone. Physically, these zones may be scattered, possibly consisting of many observing<br />

spots and a large unobserved area, but logically, the scattered observing spots can be considered<br />

together as a single observed zone. This is illustrated on the left hand side of Figure 3.1.<br />

[Figure 3.1 image; labels: observation spots, observed zone, mix zone, ports numbered 1 to 6]<br />

Figure 3.1: On the left hand side: The figure illustrates how a road network is divided into<br />

an observed and an unobserved zone in the model. In the figure, the observed zone is grey, and<br />

the unobserved zone is white. The unobserved zone functions as a mix zone, because the vehicles<br />

change pseudonyms and mix within this zone making it difficult for the adversary to track them.<br />

On the right hand side: The figure illustrates how the road network on the left can be abstracted<br />

as a single mix zone with six ports.<br />

Note that the vehicles do not know where the adversary installed her radio receivers, or in<br />

other words, when they are in the observed zone. For this reason, we can assume that the vehicles<br />

continuously change their pseudonyms 1 . In this part of the chapter, we can abstract away the<br />

frequency of the pseudonym changes, and we can simply assume that it is high enough so that<br />

every vehicle surely changes pseudonym while in the unobserved zone. I intend to relax this<br />

assumption in my future work.<br />

Since the vehicles change pseudonyms while in the unobserved zone, that zone functions as a<br />

mix zone for vehicles (see the right hand side of Figure 3.1 for illustration). A mix zone [Beresford<br />

and Stajano, 2003; Beresford and Stajano, 2004] is similar to a mix node of a mix network [Chaum,<br />

1981], which changes the encoding and the order of messages in order to make it difficult for the<br />

adversary to link message senders and message receivers. In my case, the mix zone makes it<br />

difficult for the adversary to link the vehicles that emerge from the mix zone to those that entered<br />

it earlier. Thus, the mix zone makes it difficult to track vehicles. On the other hand, based on the<br />

observation that I made in the Introduction, I assume that the adversary can track the physical<br />

location of the vehicles while they are in the observed zone, despite the fact that they may change<br />

pseudonyms in that zone too.<br />

1 Otherwise, if the vehicles knew when they are in the unobserved zone, then it would be sufficient to change their<br />

pseudonyms only once while they are in the unobserved zone.<br />


Since the vehicles move on roads, they cannot cross the border between the mix zone and the<br />

observed zone at any arbitrary point. Instead, the vehicles cross the border where the roads cross<br />

it. We can model this by assuming that the mix zone has ports, and the vehicles can enter and<br />

exit the mix zone only via these ports. For instance, on the right hand side of Figure 3.1, the ports<br />

are numbered from 1 to 6.<br />

3.2.2 The model of the mix zone<br />

While the adversary cannot observe the vehicles within the mix zone, we can assume that she still<br />

has some knowledge about the mix zone. This knowledge is captured in a model that consists of a<br />

matrix $Q = [q_{ij}]$ of size $M \times M$, where $M$ is the number of ports of the mix zone, and $M^2$ discrete<br />

probability distributions $f_{ij}(t)$ ($1 \le i, j \le M$). Here, $q_{ij}$ is the conditional probability of exiting the<br />

mix zone at port $j$ given that the entry point was port $i$, and $f_{ij}(t)$ describes the probability distribution<br />

of the delay when traversing the mix zone from port $i$ to port $j$. We can assume that time<br />

is slotted, which is why $f_{ij}(t)$ is a discrete function. I note that an attacker is unlikely<br />

to obtain such comprehensive knowledge of the mix zone. However, it is not impossible to<br />

approximate the needed probabilities and functions with extensive real-world measurements. In<br />

the rest of the chapter, we consider the worst case (as is advisable in the field of security):<br />

the attacker knows the model of the mix zone.<br />

3.2.3 The operation of the adversary<br />

The adversary knows the model of the mix zone and she observes events, where an event is a<br />

pair consisting of a port (port number) and a time stamp (time slot number). There are entering<br />

events and exiting events corresponding to vehicles entering and exiting the mix zone, respectively.<br />

Naturally, an entering event consists of the port where the vehicle entered the mix zone, and the<br />

time when this happened. Similarly, an exiting event consists of the port where the vehicle left the<br />

mix zone, and the time when this happened.<br />

The general objective of the adversary is to relate exiting events to entering events. More<br />

specifically, in the model, the adversary picks a vehicle v in the observed zone and tracks its<br />

movement until it enters the mix zone. In the following, I denote the port at which v entered<br />

the mix zone by $s$. Then, the adversary observes the exiting events for a time $T$ such that the<br />

probability that $v$ leaves the mix zone before $T$ is close to 1 (i.e., $\Pr\{t_{out} < T\} = 1 - \epsilon$, where $\epsilon$ is a<br />

small number, typically in the range of 0.005 to 0.01, and $t_{out}$ is the random variable denoting the<br />

time at which the selected vehicle $v$ exits the mix zone). For each exiting vehicle $v'$, the adversary<br />

determines the probability that $v'$ is the same as $v$. For this purpose, she uses her observations and<br />

the model of the mix zone. Finally, she decides which exiting vehicle corresponds to the selected<br />

vehicle v.<br />

The decision algorithm used by the adversary is intuitive and straightforward: the adversary<br />

knows that the selected vehicle v entered the mix zone at port s and in timeslot 0. For each exiting<br />

event $k = (j, t)$ that the adversary observes afterwards, she can compute the probability $p_{jt}$ that<br />

$k$ corresponds to the selected vehicle as $p_{jt} = q_{sj} f_{sj}(t)$ (i.e., the probability that $v$ chooses port<br />

$j$ as its exit port given that it entered the mix zone at port $s$, multiplied by the probability that<br />

it covers the distance between ports $s$ and $j$ in time $t$). The adversary chooses the vehicle for<br />

which $p_{jt}$ is maximal. The adversary is successful if the chosen vehicle is indeed $v$.<br />

In fact, the decision algorithm described above realizes a Bayesian decision (see Section<br />

3.2.4 for more details). This matters because the Bayesian decision minimizes<br />

the error probability; thus, it is in some sense the ideal decision algorithm for the adversary.<br />
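The decision rule of this subsection can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data layout (a nested list for Q, a dictionary of per-port-pair delay distributions), the function names `exit_score` and `pick_exit_event`, the 0-based port numbering, and all numeric values are hypothetical and are not taken from the simulation code of this thesis.

```python
from typing import Dict, List, Tuple

Event = Tuple[int, int]  # (exit port j, time slot t)


def exit_score(s: int, event: Event, q: List[List[float]],
               f: Dict[Tuple[int, int], Dict[int, float]]) -> float:
    """p_jt = q_sj * f_sj(t) for a target that entered at port s in slot 0."""
    j, t = event
    # Delays that never occur in the model get probability 0.
    return q[s][j] * f.get((s, j), {}).get(t, 0.0)


def pick_exit_event(s: int, exits: List[Event], q: List[List[float]],
                    f: Dict[Tuple[int, int], Dict[int, float]]) -> Event:
    """Return the observed exit event with maximal p_jt."""
    return max(exits, key=lambda event: exit_score(s, event, q, f))


# Toy model with two ports (0-indexed here); all numbers are made up.
q = [[0.2, 0.8],
     [0.5, 0.5]]
f = {
    (0, 0): {3: 0.5, 4: 0.5},
    (0, 1): {2: 0.25, 3: 0.5, 4: 0.25},
}
exits = [(0, 3), (1, 2), (1, 3)]
best = pick_exit_event(0, exits, q, f)
print(best)
```

In this toy instance the three exit events score 0.1, 0.2 and 0.4, so the adversary picks the vehicle leaving at port 1 in time slot 3.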

3.2.4 Analysis of the adversary<br />

In this section, I show that the decision algorithm of the adversary described in Subsection 3.2.3<br />

realizes a Bayesian decision. The following notations are used:<br />


$k$ is an index of a vector. Every port-timeslot pair can be mapped to such an index and<br />

$k$ can be mapped back to a port-timeslot pair. Therefore indices and port-timeslot pairs<br />

are interchangeable, and in the following discussion, I always use the one which makes the<br />

presentation simpler.<br />

$k \in \{1, \dots, M \cdot T\}$, where $M$ is the number of ports, and $T$ is the length of the attack measured<br />

in timeslots.<br />

$C = [c_k]$ is a vector, where $c_k$ is the number of cars leaving the mix zone at $k$ during the<br />

attack.<br />

$N$ is the number of cars leaving the mix zone before timeslot $T$ (i.e., $N = \sum_{k=1}^{MT} c_k$).<br />

$p_s(k)$ is the probability of the event that the target vehicle leaves the mix zone at $k$ (port<br />

and time) conditioned on the event that it enters the zone at port $s$ at time 0. The attacker<br />

knows exactly which port is $s$. The probability $p_s(k)$ can be computed as $p_s(k) = q_{sj} f_{sj}(t)$,<br />

where port $j$ and timeslot $t$ correspond to index $k$.<br />

$p(k)$ is the probability of the event that a vehicle leaves the mix zone at $k$ (port and time).<br />

This distribution can be calculated from the input distribution and the transition probabilities:<br />

$p(k) = \sum_{s=1}^{M} p_s(k)$.<br />

$\Pr(k|C)$ is the conditional probability that the target vehicle left the mix zone at the time and<br />

port defined by $k$, given that the attacker’s observation is vector $C$.<br />

We must determine for which $k$ the probability $\Pr(k|C)$ is maximal. Let us denote this $k$ by $k^*$.<br />

The probability $\Pr(k|C)$ can be rewritten using the Bayes rule:<br />

\[ \Pr(k|C) = \frac{\Pr(C|k)\, p_s(k)}{\Pr(C)} \]

Then $k^*$ can be computed as:<br />

\[ k^* = \arg\max_k \frac{\Pr(C|k)\, p_s(k)}{\Pr(C)} = \arg\max_k \Pr(C|k)\, p_s(k) \]

$\Pr(C|k)$ has a multinomial distribution with the condition that at least one vehicle (the target of<br />

the attacker) must leave the mix zone at $k$:<br />

\[ \Pr(C|k) = \frac{N!}{c_1! \cdots c_{k-1}!\, (c_k - 1)!\, c_{k+1}! \cdots c_{MT}!}\; p(k)^{c_k - 1} \prod_{j=1,\, j \neq k}^{MT} p(j)^{c_j} \]

$\Pr(C|k)$ can be multiplied and divided by $\frac{p(k)}{c_k}$ to simplify the equation:<br />

\[ \Pr(C|k) = \frac{c_k}{p(k)} \left( \frac{N!}{c_1! \cdots c_{MT}!} \prod_{j=1}^{MT} p(j)^{c_j} \right) \]

where the bracketed part is a constant, which does not have any effect on the maximization; thus<br />

it can be omitted:<br />

\[ k^* = \arg\max_k \frac{c_k}{p(k)}\, p_s(k) = \arg\max_k \frac{c_k}{p(k)\, N}\, p_s(k) = \arg\max_k \frac{p_k}{p(k)}\, p_s(k) \]

where $p_k$ is the empirical distribution of $k$ (i.e., $p_k = c_k/N$). If the number of vehicles in the<br />

mix zone is large enough, then $\frac{p_k}{p(k)} \approx 1$. Thus the intuitive algorithm described in<br />

Subsection 3.2.3 is correct:<br />

\[ k^* = \arg\max_k p_s(k) \]


This means that if many vehicles are traveling in the mix zone, then the attacker must choose<br />

the vehicle with the highest ps(k) probability.<br />

3.2.5 The level of privacy provided by the mix zone<br />

There are various metrics to quantify the level of privacy provided by the mix zone (and the<br />

fact that the vehicles continuously change pseudonyms). A natural metric in the model is the<br />

success probability of the adversary when making her decision as described above. If the success<br />

probability is large, then the mix zone and changing pseudonyms are ineffective. On the other<br />

hand, if the success probability of the adversary is small, then tracking is difficult and the system<br />

ensures location privacy.<br />

We can note that the level of privacy is often measured using the anonymity set size as the<br />

metric [Chaum, 1988], however, in this case, this approach cannot be used. The problem is that<br />

as described above, with probability ϵ, the selected vehicle v is not in the set V of vehicles exiting<br />

the mix zone during the experiment of the adversary, and therefore, by definition, V cannot be the<br />

anonymity set for v. Although, the size of V could be used as a lower bound on the real anonymity<br />

set size, there is another problem with the anonymity set size as privacy metric. Namely, it is an<br />

appropriate privacy metric only if each member of the set is equally likely to be the target of the<br />

observation, however, as we will see in Section 3.3, this is not the case in my model.<br />

Obviously, the success probability of the adversary is very difficult to determine analytically<br />

due to the complexity of the model. Therefore, I ran simulations to determine its empirical value<br />

in realistic situations. The simulation setting and parameters, as well as the simulation results are<br />

described in the next section.<br />

3.3 Simulation of mix zone<br />

The purpose of the simulation is to get an estimation of the success probability of the attacker in<br />

realistic scenarios. In this section, I first describe the simulation settings, and then, I present the<br />

simulation results.<br />

3.3.1 Simulation settings<br />

The simulation was carried out in three main phases. In the first phase, I generated a realistic map,<br />

where the vehicles moved during the simulation. This map was generated by MOVE [Karnadi et<br />

al., 2005], a tool that allows the user to quickly generate realistic mobility models for vehicular<br />

network simulations. My map is illustrated in Figure 3.2. In fact, it is a simplified map of Budapest,<br />

the capital of Hungary, and it contains the main roads of the city. I believe that despite the<br />

simplifications, this map is still complex enough to get realistic traffic scenarios.<br />

The second phase of the simulation was to generate the movement of the vehicles on the<br />

generated map. This was done by SUMO [Krajzewicz et al., 2002], which is an open source microtraffic<br />

simulator, developed by the Center for Applied Informatics (ZAIK) and the Institute of<br />

Transport Research at the German Aerospace Center. SUMO dumps the state of the simulation in<br />

every time step into files. This state dump contains the location and the velocity of every vehicle<br />

during the simulation.<br />

In the third phase of the simulation, I processed the state dump generated by SUMO, and<br />

simulated the adversary. This part of the simulation was written in Perl, because Perl scripts can<br />

easily process the XML files generated by SUMO. Note that for the purpose of repeatability, I<br />

made the source code available on-line at http://www.crysys.hu/~holczer/ESAS07.<br />

I implemented the adversary as follows. First, I defined the observation spots (position and<br />

radius) of the adversary in a configuration file. Then, I let the adversary build her model of the mix<br />

zone (i.e., the complement of her observation spots) by allowing her to track the vehicles as if they<br />

did not change their pseudonyms. In effect, the adversary’s knowledge is represented by a set of<br />

two-dimensional tables. Each table $K^{(i)}$ corresponds to a port $i$ of the mix zone, and contains empirical<br />

probabilities. More specifically, the entry $K^{(i)}_{jt}$ of table $K^{(i)}$ contains the empirical probability that<br />

a vehicle exits the mix zone at port $j$ in time $t$ given that it entered the mix zone at port $i$ at time<br />

0. The size of the tables is $M \times T$, where $M$ is the number of ports of the mix zone and $T$ is<br />

the duration of the learning procedure, defined as the time by which every observed vehicle has left<br />

the mix zone. Once the adversary’s knowledge is built, she can use it for making decisions as<br />

described above in Section 3.2. I executed several simulation runs in order to get an estimate<br />

of the success probability of the adversary.<br />

Figure 3.2: Simplified map of Budapest generated for the simulation.<br />
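As an illustration of the learning step, the following minimal Python sketch builds one table per entry port from a list of observed traversals. It is not the Perl code used for the thesis; the function name `build_tables`, the tuple layout `(entry_port, exit_port, delay_slots)`, and the sample data are assumptions made for the example. Under the model of Section 3.2.2, each table entry is an empirical estimate of the product of the exit probability and the delay probability for the given port pair and delay.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Tuple

Traversal = Tuple[int, int, int]  # (entry port i, exit port j, delay t in slots)


def build_tables(traversals: Iterable[Traversal]) -> Dict[int, Dict[Tuple[int, int], float]]:
    """Return tables[i][(j, t)]: empirical Pr(exit at j after t | entered at i)."""
    counts: Dict[int, Counter] = defaultdict(Counter)
    for i, j, t in traversals:
        counts[i][(j, t)] += 1

    tables: Dict[int, Dict[Tuple[int, int], float]] = {}
    for i, counter in counts.items():
        total = sum(counter.values())  # vehicles that entered at port i
        tables[i] = {jt: n / total for jt, n in counter.items()}
    return tables


# Hypothetical learning data: four traversals of the mix zone.
observed = [(1, 2, 3), (1, 2, 3), (1, 3, 5), (2, 1, 4)]
tables = build_tables(observed)
print(tables[1])
```

Storing the tables sparsely (only the port and delay pairs that were actually seen) is a design choice of this sketch; a dense M × T array would match the description in the text more literally.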

I ran experiments with adversaries of different strengths, where the strength of the adversary<br />

depends on the number of her eavesdropping receivers. In the simulations, all receivers were<br />

deployed in the middle of the junctions of the roads. The eavesdropping radius of the receivers was<br />

set to 50 meters. The number of receivers varied between 5 and 59 with a step size of 5 (note<br />

that the map contains 59 junctions). The junctions with the highest traffic were always chosen as<br />

the observation spots of the adversary (for instance, when the adversary had ten receivers, I chose<br />

the ten junctions with the largest traffic).<br />

In addition to the strength of the adversary, I varied the intensity of the traffic. More specifically,<br />

I simulated three types of traffic: low, medium, and high. Low traffic means that 250 vehicles are<br />

emitted into the traffic flow in each time step; medium traffic means that 500 vehicles<br />

are emitted; and high traffic means that 750 vehicles are emitted.<br />

For each simulation setting (strength of the adversary and intensity of the road traffic) 100<br />

simulations were performed.<br />

3.3.2 Simulation results<br />

Figure 3.3 contains the resulting success probabilities of the adversary as a function of her strength.<br />

The different curves belong to different traffic intensities. The results are quite intuitive: we can<br />

conclude that the stronger the adversary, the higher her success probability. Note, however, that<br />

from above a given strength, the success probability saturates at about 60 %. Higher success<br />

probabilities cannot be achieved, because the order of the vehicles may change between junctions<br />

without the adversary being able to track that. Note also that the saturation point is<br />

reached when the adversary controls only half of the junctions. The intensity of the traffic is a much less<br />

important parameter than the strength of the attacker. The success probability of the attacker is<br />

nearly independent of the intensity of the traffic above a given attacker strength.<br />

[Figure 3.3 plot; x axis: Number of attacker antennas (0 to 60); y axis: Success probability of an attack [%] (10 to 80); curves: Low traffic, Medium traffic, High traffic]<br />

Figure 3.3: Success probabilities of the adversary as a function of her strength. The three curves<br />

represent three different scenarios (the darker the line, the more intensive the traffic).<br />

The dark bars in Figure 3.4 show how the size of the set V of vehicles that exit the mix<br />

zone during the observation period, and among which the adversary has to identify the selected<br />

vehicle, varies with the strength of the adversary. The three sub-figures are related to the three<br />

different traffic situations (low traffic – left, medium traffic – middle, high traffic – right). While<br />

the size of V seems to be large (which seemingly makes the adversary’s decision difficult), it is<br />

also interesting to examine how uniform this set V is in terms of the probabilities assigned to the<br />

vehicles in V. Recall that the adversary computes a probability $p_{jt}$ for each vehicle $v'$ in V, which<br />

is the probability that $v' = v$. These probabilities can be normalized to obtain a distribution, and the<br />

entropy of this distribution can be computed. From this entropy, I computed the effective size of V<br />

(i.e., the size to which V can be compressed due to the non-uniformity of the distribution over its<br />

members), and the light bars in the figure illustrate the obtained values. As we can see, the effective<br />

size of V is much smaller than its real size, which means that the distribution corresponding to<br />

the members of V is highly non-uniform. This is the reason why the adversary can be successful.<br />
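The entropy-based effective size can be illustrated with a short Python sketch. The text does not spell out the exact formula, so the definition used here, two raised to the Shannon entropy (in bits) of the normalized probabilities, is an assumption, as are the function name `effective_set_size` and the sample values.

```python
import math
from typing import Sequence


def effective_set_size(scores: Sequence[float]) -> float:
    """Normalize the scores to a distribution and return 2**H (H in bits)."""
    total = sum(scores)
    if total <= 0:
        raise ValueError("at least one score must be positive")
    probs = [x / total for x in scores if x > 0]
    entropy = -sum(p * math.log2(p) for p in probs)
    return 2.0 ** entropy


# Four equally likely candidates: the effective size equals the real size.
uniform = effective_set_size([0.1, 0.1, 0.1, 0.1])
# One dominant candidate: the effective size is well below four.
skewed = effective_set_size([0.7, 0.1, 0.1, 0.1])
print(uniform, skewed)
```

With this definition a uniform distribution over n vehicles gives an effective size of n, and a distribution concentrated on one vehicle gives an effective size of 1, which matches the interpretation of the light bars as a compressed set size.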

3.4 Global attacker<br />

In the following part of this chapter, I assume a global eavesdropping attacker instead of a local<br />

attacker. A global eavesdropping attacker can hear all of the messages sent by the vehicles. Protecting<br />

privacy against such an attacker is more challenging than in the local attacker scenario. My work is inspired by the<br />

work of [Freudiger et al., 2007]. However, [Freudiger et al., 2007] requires the use of significant<br />

infrastructure. By replacing [Freudiger et al., 2007]’s cryptographic mix zones with zones of silence<br />

I address semantic mixing and infrastructure requirements simultaneously. In the following, in<br />

Section 3.5, I give a framework in which the minimal requirements for providing privacy for vehicles<br />

are analyzed. Next, in Section 3.6, I introduce my attacker model and my proposed solution, and in<br />

Section 3.7, I present the results of my experiments showing that my approach does indeed make<br />

tracing of vehicles hard for the attacker, and that it is usable in the real world.<br />


Figure 3.4: The dark bars show how the size of the set V of the vehicles that exit the mix zone<br />

during the observation period varies with the strength of the adversary (y axis: number of attacker<br />

antennas). The three sub-figures are related to the three different traffic situations (low traffic –<br />

left, medium traffic – middle, high traffic – right). The light bars illustrate the effective size of V .<br />

As we can see, the effective size is much smaller than the real size, which means that the distribution<br />

corresponding to the members of V is highly non-uniform.<br />

3.5 Framework for location privacy in VANETs<br />

Any system that aims to provide privacy for vehicles must address the following areas 2 :<br />

Syntactic privacy. In brief, all vehicles that use pseudonyms must change those pseudonyms<br />

from time to time. This area includes:<br />

N1 Pseudonymity: An identifier that is available to an eavesdropper must not be directly linkable<br />

to the vehicle (for example, it must not contain the VIN, the driver’s name, or anything else<br />

an eavesdropper might know).<br />

N2 Change of identifiers: Identifiers must change with some frequency 3 .<br />

N3 Local synchronization of change of identifiers: All identifiers, up and down the network<br />

stack, must change simultaneously. (This is not a communications issue as such, but a local<br />

engineering issue; however, it must be addressed).<br />

N4 Cooperative synchronization of change of identifiers or syntactic mixing: A vehicle in an<br />

observed area must change its identifier at the same time as at least one other vehicle and<br />

the two (or more) changing vehicles must do so in a way that allows semantic privacy as<br />

defined below 4 .<br />

N5 Pseudonym use: This covers two intermingled areas:<br />

2 This section is mainly based on the work of my coauthor, William Whyte [Buttyan et al., 2009].<br />

3 The frequency of change that provides privacy to the level expected by a user will in practice often depend on local regulation.<br />

4 Otherwise, an attacker who sees, for instance, identifiers (A, B, C, D) at time t and (A, B, C, E) at time t + 1 will<br />

know that D and E refer to the same vehicle.<br />


N5.1 Pseudonym format: What cryptographic mechanism is used by pseudonym owners to<br />

authenticate that they are valid units within the system?<br />

N5.2 Pseudonym issuance and renewal: How are pseudonyms issued? How does a vehicle<br />

avoid running out of them? (The answer to this may involve the identifier change<br />

frequency, N2.) What assumptions are necessary about the infrastructure to ensure<br />

that a vehicle is not left without pseudonyms?<br />

Semantic privacy. This captures the idea that vehicles must not be traceable by reconstructing<br />

the trajectories implied by their heartbeat messages. This area includes:<br />

M1 Semantic unlinkability: A vehicle’s stream of heartbeat messages must be interrupted at some<br />

frequency for some period of time.<br />

M2 Semantic mixing: Semantic unlinkability is valuable mainly in so far as it creates ambiguity<br />

for an attacker about whether a resumed stream of heartbeats comes from vehicle A or vehicle<br />

B.<br />

Robust privacy. This captures how misbehaving entities within the system may affect privacy and<br />

security. This area includes:<br />

R1 Privacy-preserving bad-actor removal: How is a misbehaving entity removed? Does this<br />

removal affect the privacy of its transmissions before it began to misbehave? Does its removal<br />

affect the privacy of other entities in the system?<br />

R2 Privacy against insider attacks: How is privacy protected against bad actors in Law Enforcement<br />

or at a Certificate Authority (CA)?<br />

This part of the chapter explicitly contributes in the area of syntactic mixing (N4), semantic<br />

mixing (M2), and semantic unlinkability (M1). The results are based on the assumption that<br />

pseudonyms are changed whenever the criteria are met. This will be fairly frequent, on the order<br />

of once every few minutes for urban driving, implicitly addressing N2. An identifier change frequency<br />

this high may require frequent reissuance of pseudonyms, limiting the choices possible in<br />

areas N5.1 and N5.2. To the best of my understanding, the following proposal is compatible with<br />

any reasonable solution for N1, N3, R1, or R2.<br />

3.6 Attacker Model and the SLOW algorithm<br />

A global attacker is assumed who can get mass coverage. Conceptually, the attacker might be the<br />

RSU network operator that has access to messages received by all RSUs, or the attacker might<br />

have set up a network covering an entire city 5 . This is clearly an extremely powerful attack model,<br />

perhaps too powerful to be plausible, but we can use this because if the system is secure in the<br />

face of this attacker it will be secure in the face of other, weaker attackers too.<br />

The attacker can use two basic mechanisms to link transmissions from a vehicle: (1) linking<br />

pseudonyms or other identifiers between heartbeat messages (syntactic linking), and (2) using the<br />

position and velocity information in the heartbeat messages to reconstruct the trajectory of the<br />

vehicle (semantic linking).<br />

We assume no supporting infrastructure in terms of an RSU network, therefore, vehicles must<br />

have a strategy to create their own mix zones, and that strategy must work even in the case where<br />

the attacker has 100% coverage. The defender’s mechanism is to turn off radio transmissions (to<br />

make semantic linking difficult) and change pseudonyms (to make syntactic linking difficult) while<br />

the radio is turned off without endangering safety of life.<br />

More precisely, the proposed solution, which is called SLOW for Silence at LOW speeds, works<br />

as follows. We choose a threshold speed $v_T$, say $v_T = 30$ km/h. A vehicle will not broadcast<br />

5 Fraunhofer Institute has established that the hardware cost (ignoring the backhaul connections) to set up receivers<br />

covering all 900 km² of Berlin is about 250,000 Euros.<br />


any heartbeat message, or any other message containing location or trajectory data in the clear,<br />

if it is traveling below speed $v_T$, unless this is necessary for safety-of-life reasons. If the vehicle<br />

has not sent a message for a certain period of time, then it changes pseudonyms (identifiers at all<br />

layers of the network stack and related certificates) before the next transmission. Traffic signals<br />

in a crowded urban area seem like an ideal location for such a pseudonym change: whenever a<br />

crowd of vehicles stops at a traffic signal, they may go into one of several lanes, they may choose<br />

to turn or not turn, and so on. Thus, mix-zones are created at the point where there is maximum<br />

uncertainty about exactly where a vehicle is and exactly what it is going to do next. This is also<br />

a safe set of circumstances under which to stop transmitting. Only 5% of pedestrians struck by a<br />

vehicle at 20 km/h die [Leaf and Preusser, 1999] while at 50 km/h the figure is 40%. Presumably,<br />

vehicle-to-vehicle collisions where both cars are traveling at 30 km/h result in even fewer fatalities.<br />
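The per-vehicle logic of SLOW described above can be sketched as a small state machine. This is a minimal illustration, not the implementation evaluated in this chapter: the class name `SlowVehicle`, the integer pseudonym counter standing in for a certified pseudonym pool, and the 5-second minimum silent period are all hypothetical (the text only says "a certain period of time"). The 30 km/h threshold follows the example value given in the text.

```python
import itertools
from dataclasses import dataclass, field
from typing import Optional

# Stand-in for a pool of certified pseudonyms (hypothetical).
_pseudonym_pool = itertools.count(1)


@dataclass
class SlowVehicle:
    v_threshold_kmh: float = 30.0      # v_T, example value from the text
    min_silence_s: float = 5.0         # assumed length of the silent period
    pseudonym: int = field(default_factory=lambda: next(_pseudonym_pool))
    last_tx_s: Optional[float] = None  # time of the last transmission

    def step(self, now_s: float, speed_kmh: float,
             safety_override: bool = False) -> Optional[int]:
        """Return the pseudonym used for a heartbeat at now_s, or None if silent."""
        if speed_kmh < self.v_threshold_kmh and not safety_override:
            return None  # silence at low speed
        silent_for = None if self.last_tx_s is None else now_s - self.last_tx_s
        if silent_for is not None and silent_for >= self.min_silence_s:
            # Change all identifiers before the next transmission.
            self.pseudonym = next(_pseudonym_pool)
        self.last_tx_s = now_s
        return self.pseudonym


car = SlowVehicle()
before = car.step(0.0, 50.0)   # fast: heartbeat sent
car.step(1.0, 50.0)            # still fast: same pseudonym
car.step(2.0, 10.0)            # slow: silent
after = car.step(12.0, 40.0)   # resumes after 11 s of silence
print(before, after)
```

The `safety_override` flag models the safety-of-life exception: a slow vehicle may still transmit when a collision risk is detected, and the same pseudonym-change rule applies to that transmission.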

Some situations can be defined as exceptions to this rule. For instance, if vehicle A is stopped at a signal, but<br />

vehicle B coming up behind it emits a heartbeat that lets vehicle A know that there is a risk of<br />

a collision, then vehicle A can send out a heartbeat to warn vehicle B to brake. We can note<br />

that the simulations do not include this exception case, because in practice these cases come up<br />

only rarely. Future research based on SLOW will investigate this exception case in greater detail.<br />

We can also note that an attacker can abuse exception cases to break the silent period, but this<br />

attacker (unless it is an inside attacker) can be tracked down by standard methods and revoked.<br />
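The transmission rule described above can be illustrated with a short Python sketch. This is only a minimal illustration under stated assumptions, not the implementation evaluated in this chapter: the class name `SlowVehicle`, the helper `new_pseudonym`, the `safety_exception` flag, and the 1 second value chosen for the "certain period of time" are all hypothetical.

```python
import secrets
from dataclasses import dataclass, field
from typing import Optional


def new_pseudonym() -> str:
    """Stand-in for a fresh set of identifiers and certificates (hypothetical)."""
    return secrets.token_hex(8)


@dataclass
class SlowVehicle:
    threshold_kmh: float = 30.0        # v_T; 30 km/h is the example value in the text
    silent_timeout_s: float = 1.0      # assumed value for "a certain period of time"
    pseudonym: str = field(default_factory=new_pseudonym)
    last_tx_s: Optional[float] = None  # time of the last transmission, if any

    def heartbeat(self, now_s: float, speed_kmh: float,
                  safety_exception: bool = False) -> Optional[str]:
        """Return the pseudonym to transmit under, or None to stay silent."""
        if speed_kmh < self.threshold_kmh and not safety_exception:
            return None  # silent below the threshold speed
        if (self.last_tx_s is not None
                and now_s - self.last_tx_s >= self.silent_timeout_s):
            # Silent for long enough: change identifiers before transmitting.
            self.pseudonym = new_pseudonym()
        self.last_tx_s = now_s
        return self.pseudonym
```

In this sketch a vehicle that slows down at a traffic signal returns `None` until it accelerates past the threshold again, at which point it reappears under a new pseudonym.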

Besides being very simple to implement, SLOW has other advantages. Traffic jams and slow<br />

traffic lead to a large number of vehicles in transmission range and therefore require extensive<br />

processing power to verify the digital signatures of all incoming heartbeat messages. By refraining<br />

from sending heartbeat messages, SLOW avoids the necessity of extensive signature verifications<br />

in traffic jams and slow traffic, and thus, reduces hardware cost. A more detailed analysis of<br />

the impact on computation complexity, as well as the level of privacy and safety provided by the<br />

scheme will be presented in the next section.<br />

3.7 Analysis of SLOW<br />

3.7.1 Privacy<br />

It must be intuitively clear that a vehicle frequently sending out heartbeat messages is easy to<br />

trace, but to the best of my knowledge, no accurate experiment confirms this statement in VANET<br />

settings. As field experiments cannot be done due to the lack of envisioned VANET infrastructure,<br />

simulations were carried out to measure the level of traceability in an urban setting. The SUMO<br />

[Krajzewicz et al., 2002] simulation environment was used, as it is a realistic, microscopic urban<br />

traffic simulator. SUMO was set to use a 100 Hz frequency for internal update of vehicle position<br />

and velocities, and every Nth position (N depending on the heartbeat frequency) was considered<br />

to be available to the attacker as a heartbeat.<br />
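The sampling of simulator output into heartbeats can be sketched as follows. The text only states that N depends on the heartbeat frequency; the sketch assumes N is the ratio of the 100 Hz update rate to the heartbeat rate, and the function name is hypothetical.

```python
def heartbeat_samples(positions, sim_hz=100, heartbeat_hz=10):
    """Keep every N-th simulator sample, with N = sim_hz / heartbeat_hz (assumed)."""
    if sim_hz % heartbeat_hz != 0:
        raise ValueError("heartbeat rate must divide the simulator rate")
    step = sim_hz // heartbeat_hz
    return positions[::step]
```

For one second of simulated positions (100 samples), a 10 Hz heartbeat keeps 10 of them and a 5 Hz heartbeat keeps 5.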

Note that tracing vehicles in an urban setting is essentially a multitarget tracking problem,<br />

which has an extensive literature, however, mostly related to radar development in the fields of<br />

aviation and sailing [Gruteser and Hoh, 2005]. Yet, the following tracking approach, consisting of<br />

three steps, can be adapted to the vehicular setting too: first, the actual position and speed of the<br />

targets are recorded by eavesdropping on the heartbeat messages. Based on the position and speed<br />

information, a predicted new position is calculated, which can be further refined with the help of side<br />

information such as the layout of the streets, lanes etc. At the next heartbeat, the new positions<br />

are eavesdropped and matched with the predicted positions.<br />

We implemented an attacker that tracked the vehicles in the SUMO output based on the tracking<br />

approach described above. The attacker uses the information in the last two heartbeats to calculate<br />

the acceleration of the vehicles, making the prediction of the next position more accurate. The<br />

vehicles are tracked from their departure to their destination. Tracking is considered successful if<br />

the attacker has not lost a target throughout its entire journey.<br />
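The predict-and-match step of such a tracker can be sketched as below. This is a simplified illustration, not the attacker implementation used for the experiments: it assumes each heartbeat is a tuple `(x, y, vx, vy)`, estimates acceleration from the velocity change between the last two heartbeats, and matches by nearest Euclidean distance. Side information such as the street layout is left out, and the function names are hypothetical.

```python
import math


def predict_next(prev, last, dt):
    """Predict the next (x, y) from two heartbeats of the form (x, y, vx, vy).

    Acceleration is estimated from the velocity change between the two
    heartbeats, then a constant-acceleration step of length dt is applied.
    """
    x, y, vx, vy = last
    ax = (vx - prev[2]) / dt
    ay = (vy - prev[3]) / dt
    return (x + vx * dt + 0.5 * ax * dt * dt,
            y + vy * dt + 0.5 * ay * dt * dt)


def match_target(prev, last, candidates, dt):
    """Return the index of the candidate heartbeat closest to the prediction."""
    px, py = predict_next(prev, last, dt)
    return min(range(len(candidates)),
               key=lambda i: math.hypot(candidates[i][0] - px,
                                        candidates[i][1] - py))
```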

The results of the tracking of 50 vehicles are shown in Figure 3.5. As we can see, if the<br />

beaconing frequency is 5-10 Hz, which is needed for most of the safety applications, then 75-80%<br />



3. LOCATION PRIVACY IN VANETS<br />

Figure 3.5: Success rate of an attacker performing vehicle tracking by semantic linking of heartbeat<br />

messages when no defense mechanisms are in use. (Plot: success rate of tracing [%], 55 to 80, versus<br />

beacon frequency [1/s], 0 to 10.)<br />

of the vehicles are tracked successfully. By evaluating the unsuccessful cases, we can observe that<br />

the target vehicles were lost at their destinations. More precisely, in the vast majority of the<br />

unsuccessful cases, when the target vehicle V1 arrived at its destination and stopped sending<br />

messages, if another vehicle V2 was in its vicinity, then the attacker continued tracking V2 as if<br />

it were V1. I counted this as an unsuccessful case, because the attacker erroneously determined the<br />

destination of the target vehicle (i.e., it concluded that the destination of V1 was that of V2, and<br />

those two destinations have virtually never been the same). However, during the movement of the<br />

target vehicles (i.e., before they reached their destination), the attacker was able to track them<br />

with a remarkable 99% success rate. This confirms that semantic linking is a real problem.<br />

In any case, from a privacy point of view, a system where the users are traceable with probability<br />

0.75-0.8 is not acceptable. My proposed silent period scheme, where the vehicles stop sending<br />

heartbeat messages below a given speed, mitigates this problem. It must be clear that the tracking<br />

algorithm described above does not work when the vehicles stop sending heartbeats regularly.<br />

Yet, the attacker may use other side information, such as the probability of turning to a given<br />

direction in an intersection, to improve the success probability of tracking despite the absence of<br />

the heartbeats. Thus, we need a new attacker model that also accounts for such side knowledge of<br />

the attacker.<br />

We can formalize the knowledge of the attacker as follows (for a summary of notations the<br />

reader is referred to Table 3.1): first, each intersection is modeled with a binary matrix J, where<br />

each row corresponds to an ingress lane and each column corresponds to an egress lane of the<br />

intersection, and Jij (the entry in the i-th row and j-th column) is 1 if it is possible to traverse<br />

the intersection by arriving in ingress lane i and leaving in egress lane j. As an example, consider<br />

the intersection shown in Figure 3.6 and its corresponding matrix J given in (3.1).<br />

\[
J = \begin{pmatrix}
0 & 0 & 0 & 1 & 1 \\
0 & 0 & 1 & 0 & 0 \\
1 & 1 & 0 & 1 & 1 \\
0 & 0 & 1 & 0 & 0 \\
1 & 1 & 0 & 0 & 0
\end{pmatrix}
\tag{3.1}
\]


Table 3.1: Notation in SLOW<br />

Symbol    Meaning<br />
vT        threshold speed<br />
J         junction descriptor matrix<br />
m         number of lanes towards the junction<br />
n         number of lanes from the junction<br />
T         probability distribution of the target’s lanes<br />
W         number of waiting vehicles per lane<br />
w         number of waiting vehicles in the junction<br />
L         list of egress events<br />
lD        decision of the attacker<br />
l̂         the target’s real egress event<br />
LS        list of suspect events<br />


Second, we can assume that the accuracy of GPS receivers is not sufficient to determine with certainty<br />

which lane of a road a given vehicle is using. Therefore, we can also assume that the attacker<br />

knows on which road a target vehicle enters the intersection, but it does not know which ingress<br />

lane it is using. Nevertheless, the attacker may have some a priori knowledge on the probability<br />

of an incoming vehicle choosing a given ingress lane on a given road in a given intersection; such<br />

knowledge may be acquired by visually observing the traffic in that intersection for some time.<br />

These probabilities can be arranged in an m-dimensional vector T, where the i-th element Ti is<br />

the probability of choosing ingress lane i when entering the intersection on the road that contains<br />

ingress lane i. As an example, consider the intersection in Figure 3.6, and the vector<br />

T = (0.6, 0.4, 1, 0.8, 0.2)<br />

This would mean that vehicles arriving at the intersection on the road that contains ingress lanes<br />

1 and 2 choose lane 1 with probability 0.6 and lane 2 with probability 0.4. Note that vehicles<br />

arriving on the road that contains only ingress lane 3 have no choice, hence T3 in this example is<br />

1.<br />

Third, when multiple possible egress lanes correspond to a given ingress lane (i.e., there is<br />

more than one 1 in a given row of matrix J), we can assume that vehicles choose any of those egress<br />

lanes uniformly at random. For example, a vehicle arriving in ingress lane 1 of the intersection in<br />

Figure 3.6 can leave the intersection in egress lane 4 or 5 with equal probability.<br />
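Taken together, the first three assumptions determine a probability distribution over egress lanes for a vehicle arriving on a given road. The following sketch computes it for the example J and T above. It is an illustration only: the function name is hypothetical, lanes are indexed from 0 in the code, and the grouping of ingress lanes into roads is passed in explicitly.

```python
# Example intersection from (3.1) and the example vector T from the text.
J = [
    [0, 0, 0, 1, 1],
    [0, 0, 1, 0, 0],
    [1, 1, 0, 1, 1],
    [0, 0, 1, 0, 0],
    [1, 1, 0, 0, 0],
]
T = [0.6, 0.4, 1.0, 0.8, 0.2]


def egress_distribution(J, T, road_lanes):
    """Probability of each egress lane for a vehicle arriving on one road.

    road_lanes lists the (0-based) ingress lanes of that road. Each ingress
    lane i is weighted by T[i]; its permitted egress lanes are equally likely.
    """
    dist = [0.0] * len(J[0])
    for i in road_lanes:
        allowed = [j for j, bit in enumerate(J[i]) if bit]
        for j in allowed:
            dist[j] += T[i] / len(allowed)
    return dist
```

For the road containing ingress lanes 1 and 2 (indices 0 and 1), this gives probability 0.4 for egress lane 3 and 0.3 each for egress lanes 4 and 5.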

Finally, when the target vehicle arrives at an intersection, there may already be some other<br />

vehicles waiting or moving below the threshold speed in that intersection. The number of such<br />

silent vehicles in ingress lane i is denoted by Wi, and the m-dimensional vector containing all<br />

Wi values is denoted by W . Note that due to the previous assumption that the attacker is not<br />

always able to precisely determine the ingress lane used by an incoming vehicle, it is also unable to<br />

determine the exact values of all Wi’s; nevertheless, it can use its experimental knowledge on the<br />

probabilities of choosing a given lane, represented by vector T , to at least estimate the Wi values.<br />

Let us denote by L the list of vehicles that leave the intersection (and thus restart sending<br />

heartbeats) after the target entered the intersection (and thus stopped sending more heartbeats).<br />

More precisely, each element Lk of list L is a (timestamp, road) pair (t, r) that represents a vehicle<br />

reappearing on road r at time t. The objective of the attacker is to decide which Lk corresponds<br />

to the target vehicle. Let us denote by ℓ the list element chosen by the attacker, and let ℓ ∗ be the<br />

list element that really corresponds to the target vehicle. The attacker is successful if and only if<br />

ℓ = ℓ ∗ .<br />

In theory, the optimal decision is the following:<br />

\[ \ell = \arg\max_{k} \Pr(L_k \mid J, T, W, L) \]

where Pr(Lk|J, T, W, L) is the probability of Lk being the right decision given all the knowledge<br />

of the attacker. However, it seems to be difficult to calculate (or estimate) all these conditional<br />


Figure 3.6: An example intersection, the corresponding matrix is given in (3.1)<br />

probabilities, as they have to be determined for every possible intersection (J), number of waiting<br />

vehicles in the intersection (W ), and observation of egress events (L).<br />

Hence, I assume a more simplistic attacker that uses the following tracking algorithm: let us<br />

denote by w the total number of silent vehicles in the intersection when the target vehicle arrives<br />

and stops sending heartbeats. The attacker decides on the w-th element of L, unless that entry<br />

surely cannot correspond to the target (e.g., it is not possible to leave the intersection on the road<br />

in the w-th element of L given the road on which the target arrived at the intersection). When<br />

the w-th element of L must be excluded, the attacker chooses the next element on the list L that<br />

cannot be excluded.<br />
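The decision rule of this simple attacker can be sketched as follows. The sketch is an assumption-laden illustration rather than the code used in the simulations: egress events are `(timestamp, road)` pairs sorted by time, the set of roads reachable from the target's arrival road is supplied by the caller, "the w-th element" is counted from 1, and the function name is hypothetical.

```python
def fifo_attacker(w, egress_events, reachable_roads):
    """Pick the attacker's guess from the list L of egress events.

    egress_events: list of (timestamp, road) pairs, ordered by time.
    reachable_roads: roads the target can leave on, given its arrival road.
    Returns the index of the chosen event, or None if every event from the
    w-th one onward is excluded.
    """
    start = max(w, 1) - 1  # the w-th element, counted from 1 as in the text
    for k in range(start, len(egress_events)):
        _, road = egress_events[k]
        if road in reachable_roads:
            return k
    return None
```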

Our simple attacker model essentially assumes that traffic at an intersection follows the FIFO<br />

(First In First Out) principle. While this is clearly not the case in practice, the attacker still<br />

achieves a reasonable success rate in a single intersection as shown in Figure 3.7. One can see, for<br />

instance, that when the total number of vehicles is 100, the attacker can still track a target vehicle<br />

through a single intersection with probability around 1/2.<br />

Figure 3.8 shows the success rate of the attacker in the general case, when the target traverses<br />

multiple intersections between its starting and destination points. As expected, the tracking capabilities<br />

of the attacker in this case are worse than in the single intersection case. The quantitative<br />

results of the simulation experiments suggest that only around 10% of the vehicles can be tracked<br />

fully by the attacker when the threshold speed is larger than 22 km/h (approximately 6 m/s).<br />
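A rough way to relate the single-intersection and multi-intersection results is the following back-of-the-envelope calculation. It rests on a simplifying assumption that is not made in the analysis above, namely that the attacker succeeds at each of k intersections independently and with the same probability p; the values of p and k are chosen purely for illustration.

```latex
\[
\Pr(\text{target traced end to end}) = p^{k},
\qquad
p = \tfrac{1}{2},\; k = 3 \;\Rightarrow\; p^{k} = 0.125 .
\]
```

Under this assumption, a per-intersection success rate of about 1/2 already drops to roughly 10% after three intersections, which is of the same order as the simulated end-to-end result.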

The effectiveness of the attacker depends on the vT threshold speed and the density of the<br />

vehicles. In general the higher the threshold speed at which vehicles stop sending heartbeats,<br />

the higher the chance that the attacker loses the target (i.e., the lower the chance of successful<br />

tracking). Moreover, in a dense network, it is more difficult to track vehicles. Note, however, that<br />

there is an important difference in practice between the traffic density and the threshold speed,<br />

namely, that the threshold speed can be influenced by the owner of the vehicle, while the traffic<br />

density cannot be.<br />

Figure 3.7: Success rate of the simple attacker in a single intersection. Different curves belong to<br />

different experiments with the total number of vehicles given in the legend. (Plot: success rate of<br />

tracing [%], 0 to 100, versus threshold speed [m/s], 0 to 10; legend: 50, 100, 150, 200.)<br />

Figure 3.8: Success rate of the simple attacker in the general case, when the target traverses multiple<br />

intersections between its starting and destination points. Different curves belong to different<br />

experiments with the total number of vehicles given in the legend. (Plot: success rate of tracing [%],<br />

0 to 100, versus threshold speed [m/s], 0 to 10; legend: 50, 100, 150, 200.)<br />

3.7.2 Effects on safety<br />

The main objective of vehicular communications is to increase road safety. However, refraining<br />

from sending heartbeat messages may seem to be in contradiction with this objective. Note,<br />

however, that I propose to refrain from sending heartbeats only below a given threshold speed,<br />

and I argue below that this may not endanger the objective of road safety.<br />

According to [Leaf and Preusser, 1999], only 5% of pedestrians struck by a vehicle at 20 km/h<br />

die, while this figure is 40% at 50 km/h. In [Kloeden et al., 1997], it is shown that in a 60 km/h<br />

speed limit area, the risk of involvement in a casualty crash doubles with each 5 km/h increase in<br />

traveling speed above 60 km/h. In [Baruya, 1998], it is shown that 1 km/h change in speed can<br />

influence the probability of an accident by 3.45%.<br />
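The doubling rule quoted from [Kloeden et al., 1997] can be written as a relative risk with respect to travel at the 60 km/h limit. The exponential form below is simply one way of reading "doubles with each 5 km/h" and the symbol R is introduced here only for illustration.

```latex
\[
\frac{R(v)}{R(60)} = 2^{(v - 60)/5} \quad (v \ge 60\ \text{km/h}),
\qquad
\frac{R(65)}{R(60)} = 2,
\qquad
\frac{R(70)}{R(60)} = 4 .
\]
```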

The statistical figures above show that at lower speeds the probability of an accident is lower<br />

too. This is because vehicles usually travel at lower speeds in areas where the drivers need to be more<br />

careful (hence the speed limit). Thus, it makes sense to rely more on the awareness of the drivers<br />

to avoid accidents at lower speeds. On the other hand, at higher speeds, accidents can be more<br />

severe, and warnings from the vehicular safety communication system can play a crucial role in<br />

avoiding fatalities.<br />

3.7.3 Effects on computation complexity<br />

A great challenge in V2V communication deployment is the processing power of the vehicles [Kargl<br />

et al., 2008]. The most demanding task of the On Board Unit (OBU) is the verification of the<br />

signatures on the received heartbeat messages. This problem can be partially handled by not<br />

attaching certificates to every heartbeat message [Calandriello et al., 2007], but it does not solve<br />

the problem of verifying the signatures on the messages.<br />

In principle, the heavier the traffic, the more vehicles are in each other's communication range.<br />

More vehicles send more heartbeats, overwhelming each other. The number of vehicles in communication<br />

range depends on the average speed of the traffic, assuming that the vehicles keep a safety<br />

distance between each other depending on their speed.<br />

In Figure 3.9, the results of some simple calculations can be seen showing the number of<br />

signature verifications performed as a function of the average speed. In this calculation, vehicles<br />

are assumed to follow each other within 2 seconds. The communication range is assumed to be<br />

100 m and the heartbeat frequency is 10 Hz. It can be seen in the figure that, in a traffic jam on<br />

an 8-lane road, each vehicle must verify as many as approximately 8,000 signatures per second. If<br />

SLOW is used with a threshold speed of around 30 km/h (approximately 8 m/s), then the vehicles<br />

never need to verify more than 1,000 signatures per second (assuming all other parameters are the<br />

same as before). This approach also works well in combination with congestion control where the<br />

transmission power is reduced in high density traffic scenarios. My approach therefore makes the<br />

hardware requirements of the OBU much lower and enables the use of less expensive devices.<br />
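The kind of calculation behind these numbers can be sketched as below. This is a reconstruction under assumptions, not necessarily the exact calculation used for Figure 3.9: spacing between vehicles in a lane is taken as the 2 second headway times the speed (vehicle length ignored), and every vehicle within the communication range ahead and behind, on every lane, contributes one signature per heartbeat. The function and parameter names are hypothetical.

```python
def verifications_per_second(speed_ms, lanes, comm_range_m=100.0,
                             heartbeat_hz=10.0, headway_s=2.0):
    """Estimate signatures to verify per second for one vehicle.

    Vehicles in a lane are spaced headway_s * speed_ms metres apart (vehicle
    length ignored), and neighbours within comm_range_m ahead and behind are
    all heard on every lane.
    """
    if speed_ms <= 0:
        raise ValueError("spacing is undefined at zero speed in this model")
    spacing_m = headway_s * speed_ms
    vehicles_in_range = lanes * (2.0 * comm_range_m / spacing_m)
    return vehicles_in_range * heartbeat_hz
```

With 8 lanes this model yields 1,000 verifications per second at 8 m/s and 8,000 at 1 m/s, which agrees with the figures quoted in the text.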

3.8 Related work<br />

The privacy of VANETs is a recent topic. Many authors have addressed VANETs and their security and<br />

privacy (for example in [Aoki and Fujii, 1996; Luo and Hubaux, 2004; McMillin et<br />

al., 1998; Chisalita and Shahmehri, 2002; El Zarki et al., 2002; Dötzer, 2006; Hubaux et al., 2004;<br />

Raya and Hubaux, 2005; Raya and Hubaux, 2007; Gerlach, 2006; Ma et al., 2010; Wiedersheim et<br />

al., 2010]). A good online bibliography for the security of VANETs can be found in [Lin and Lu,<br />

2012]. The problem of providing location privacy for VANETs is categorised into classes in [Gerlach, 2006].<br />

The classes differ in the goal and the strength of the attacker. In<br />

[Choi et al., 2005], Choi et al. investigate how to obtain a balance between privacy and audit<br />

requirements in vehicular networks using only symmetric primitives. Ren et al. analyze the<br />

location privacy problems in VANETs with attack trees in [Ren et al., 2011].<br />

Figure 3.9: Number of signatures to be verified as a function of the average speed. The communication<br />

range is 100 m, and the heartbeat frequency is 10 Hz. Safety distance between the vehicles<br />

depends on their speed. (Plot: number of signatures to be verified [1/s], 0 to 8000, versus speed<br />

[m/s], 0 to 40; legend: 8 lanes, 6 lanes, 4 lanes, 2 lanes.)<br />

Many privacy preserving techniques have been suggested for on-line transactions (for example in<br />

[Chaum, 1988; Gulcu and Tsudik, 1996]). They are mainly based on mix networks [Kesdogan<br />

et al., 1998; Reiter and Rubin, 1998], which were first proposed by Chaum in 1981 [Chaum,<br />

1981]. A single mix collects messages, mixes them, and sends them towards their destinations. A mix<br />

network consists of single mixes, which are linked together. In a mix network, some misbehaving<br />

mixes cannot break the anonymity of the senders/receivers.<br />

An evident extension of mix networks to the off-line world is the mix zone, proposed by<br />

Beresford and Stajano in [Beresford and Stajano, 2003; Beresford and Stajano, 2004]. A mix zone is a<br />

place where the users of the network are mixed, thus after leaving the mix zone, they cannot be<br />

distinguished from each other.<br />

The problem of providing location privacy in wireless communication is well studied by Hu and<br />

Wang in [Hu and Wang, 2005]. They built a transaction-based wireless communication system<br />

in which transactions are unlinkable, and give detailed simulation results. Their solution can<br />

provide location privacy for real-time applications as well.<br />

To assess the operation of the mix zones, the offered anonymity must be measured. The first<br />

metric, proposed by Chaum [Chaum, 1988], was the size of the anonymity set. It is a good metric<br />

only if any user leaving the mix zone is the target with the same probability. If the probabilities<br />

are different, then an entropy based metric should be used. Entropy based metrics were suggested by<br />

Díaz et al. [Diaz et al., 2002] and Serjantov and Danezis [Serjantov and Danezis, 2003] at the same time.<br />
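The difference between the two metrics can be made concrete with a small sketch. It assumes the attacker's belief is given as a list of probabilities over the users leaving the mix zone; the function names are hypothetical, and Shannon entropy in bits is used as the entropy based metric.

```python
import math


def anonymity_set_size(probs):
    """Number of candidates with non-zero probability of being the target."""
    return sum(1 for p in probs if p > 0)


def entropy_bits(probs):
    """Shannon entropy (in bits) of the attacker's distribution over candidates."""
    return -sum(p * math.log2(p) for p in probs if p > 0)
```

Four equally likely candidates give an anonymity set of size 4 and an entropy of 2 bits. If one candidate has probability 0.97 and the other three 0.01 each, the set size is still 4, but the entropy falls to about 0.24 bits, reflecting that the target is almost identified.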

To the best of my knowledge, one of the most relevant works to SLOW is that of Sampigethaya<br />

et al. in [Sampigethaya et al., 2005; Sampigethaya et al., 2007]. In these papers, they study the<br />

problem of providing location privacy in VANETs in the presence of a global adversary. A location<br />

privacy scheme called CARAVAN is also proposed. The main idea of the scheme is that random<br />

silent periods [Huang et al., 2005] are used in the communication to avoid continuous traceability.<br />

The solution is evaluated only in a freeway model and in a randomly generated Manhattan street<br />

model. Lu et al. arrive at conclusions similar to those of SLOW, namely, that the pseudonyms should<br />

be changed at intersections with high traffic [Lu et al., 2012]. The main difference between the<br />

two approaches is that in their paper, the vehicles are aware of the possible zones from a predefined<br />


map, so the mix zones are defined a priori. They use a game theoretic approach to analyze their<br />

model.<br />

The change of pseudonyms may also have a detrimental effect, especially on the efficiency of<br />

routing and the packet loss ratio. In [Schoch et al., 2006], Schoch et al. investigated this problem<br />

and proposed some approaches that can guide system designers to achieve both a given level of<br />

privacy protection as well as a reasonable level of performance.<br />

Another proposed approach provides multiple certificates in vehicles based on the combination<br />

of group signatures and multiple self-issued certificates [Calandriello et al., 2007; Armknecht et<br />

al., 2007]. The disadvantage is that On Board Units (OBUs) need to perform expensive group<br />

signature verification operations, and that OBUs are empowered to mount Sybil attacks. [Studer<br />

et al., 2008] uses group signatures to request temporary certificates from a CA in an anonymous<br />

manner without the disadvantages of the previous scheme, but at the cost of an available connection<br />

to the CA. My solution suggested in Section 3.6 accounts for a global attacker without the support<br />

of the RSU infrastructure.<br />

3.9 Conclusion<br />

In the first half of this chapter from Section 3.2, I studied the effectiveness of changing pseudonyms<br />

to provide location privacy for vehicles in vehicular networks. The approach of changing pseudonyms<br />

to make location tracking more difficult was proposed in prior work, but its effectiveness<br />

has not been investigated yet. In order to address this problem, I defined a model based on the<br />

concept of the mix zone. I assumed that the adversary has some knowledge about the mix zone,<br />

and based on this knowledge, she tries to relate the vehicles that exit the mix zone to those that<br />

entered it earlier. I also introduced a metric to quantify the level of privacy enjoyed by the vehicles<br />

in this model. In addition, I performed extensive simulations to study the behavior of the model in<br />

realistic scenarios. In particular, in the simulation, I used a rather complex road map, generated<br />

traffic with realistic parameters, and varied the strength of the adversary by varying the number of<br />

her monitoring points. My simulation results provided detailed information about the relationship<br />

between the strength of the adversary and the level of privacy achieved by changing pseudonyms.<br />

I abstracted away the frequency with which the pseudonyms are changed, and I simply assumed<br />

that this frequency is high enough so that every vehicle surely changes pseudonym while in the mix<br />

zone. It seems that changing the pseudonyms frequently has some advantages as frequent changes<br />

increase the probability that the pseudonym is changed in the mix zone. On the other hand, the<br />

higher the frequency, the larger the cost that the pseudonym changing mechanism induces on the<br />

system in terms of management of cryptographic material (keys and certificates related to the<br />

pseudonyms). In addition, if for a given frequency, the probability of changing pseudonym in the<br />

mix zone is already close to 1, then there is no point in increasing the frequency further as it will<br />

no longer increase the level of privacy, while it will still increase the cost. Hence, there seems to<br />

be an optimal value for the frequency of the pseudonym change. Unfortunately, this optimal value<br />

depends on the characteristics of the mix zone, which is ultimately determined by the observing<br />

zone of the adversary, which is not known to the system designer.<br />

In the second half of the chapter from Section 3.4, I proposed a simple and effective privacy<br />

preserving scheme, called SLOW, for VANETs. SLOW requires vehicles to stop sending heartbeat<br />

messages below a given threshold speed (this explains the name SLOW that stands for “silence<br />

at low speeds”) and to change all their identifiers (pseudonyms) after each such silent period. By<br />

using SLOW, the vicinities of intersections and traffic lights become dynamically created mix zones,<br />

as there are usually many vehicles moving slowly at these places at a given moment in time. In<br />

other words, SLOW implicitly ensures a synchronized silent period and pseudonym change for<br />

many vehicles both in time and space, and this makes it effective as a location privacy enhancing<br />

scheme. Yet, SLOW is remarkably simple, and it has further advantages. For instance, it relieves<br />

vehicles of the burden of verifying a potentially large number of digital signatures when the vehicle<br />

density is large, as this usually happens when the vehicles move slowly in a traffic jam or stop at<br />


intersections. Finally, the risk of a fatal accident at a slow speed is low, and therefore, SLOW does<br />

not seriously impact safety-of-life.<br />

I evaluated SLOW in a specific attacker model that seems to be realistic, and it proved to be<br />

effective in this model, reducing the success rate of tracking a target vehicle from its starting point<br />

to its destination down to the range of 10–30%.<br />

As a conclusion of this chapter, I analyzed what a local and a global eavesdropping attacker<br />

can do when trying to trace vehicles in VANETs, and gave an efficient countermeasure against the<br />

stronger global attacker.<br />

3.10 Related publications<br />

[Buttyan et al., 2007] Levente Buttyan, Tamas Holczer, and Istvan Vajda. On the effectiveness of<br />

changing pseudonyms to provide location privacy in VANETs. In Proceedings of the Fourth European<br />

Workshop on Security and Privacy in Ad hoc and Sensor Networks (ESAS2007). Springer, 2007.<br />

[Papadimitratos et al., 2008] Panagiotis Papadimitratos, Antonio Kung, Frank Kargl, Zhendong<br />

Ma, Maxim Raya, Julien Freudiger, Elmar Schoch, Tamas Holczer, Levente Buttyán, and<br />

Jean-Pierre Hubaux. Secure vehicular communication systems: design and architecture. IEEE<br />

Communications Magazine, 46(11):100–109, 2008.<br />

[Holczer et al., 2009] Tamas Holczer, Petra Ardelean, Naim Asaj, Stefano Cosenza, Michael<br />

Müter, Albert Held, Björn Wiedersheim, Panagiotis Papadimitratos, Frank Kargl, and Danny De<br />

Cock. Secure vehicle communication (SeVeCom). Demonstration. Mobisys, June 2009.<br />

[Buttyan et al., 2009] Levente Buttyan, Tamas Holczer, Andre Weimerskirch, and William<br />

Whyte. SLOW: A practical pseudonym changing scheme for location privacy in VANETs. In Proceedings<br />

of the IEEE Vehicular Networking Conference, pages 1–8. IEEE, October 2009.<br />



Chapter 4<br />

Anonymous Aggregator Election and Data<br />

Aggregation in Wireless Sensor Networks<br />

4.1 Introduction<br />

Wireless sensor and actuator networks are potentially useful building blocks for cyber-physical<br />

systems. Those systems must typically guarantee high-confidence operation, which induces strong<br />

requirements on the dependability of their building blocks, including the wireless sensor and actuator<br />

network. Dependability means resistance against both accidental failures and intentional<br />

attacks, and it should be addressed at all layers of the network architecture, including the networking<br />

protocols and the distributed services built on top of them, as well as the hardware and<br />

software architecture of the sensor and actuator nodes themselves. Within this context, in this<br />

chapter, I focus on the security aspects of aggregator node election and data aggregation protocols<br />

in wireless sensor networks.<br />

Data aggregation in wireless sensor networks helps to improve the energy efficiency and the<br />

scalability of the network. It is typically combined with some form of clustering. A common<br />

scenario is that sensor readings are first collected in each cluster by a designated aggregator node<br />

that aggregates the collected data and sends only the result of the aggregation to the base station.<br />

In another scenario, the base station may not be present permanently in the network, and the<br />

aggregated data must be stored by the designated aggregator node in each cluster temporarily<br />

until the base station can eventually fetch the data. In both cases, the amount of communication,<br />

and hence, the energy consumption of the network can be greatly reduced by sending aggregated<br />

data, instead of individual sensor readings, to the base station.<br />

While data aggregation in wireless sensor networks is clearly advantageous with respect to<br />

scalability and efficiency, it introduces some security issues. In particular, the designated aggregator<br />

nodes that collect and store aggregated sensor readings and communicate with the base station are<br />

attractive targets of physical node destruction and jamming attacks. Indeed, it is a good strategy<br />

for an attacker to locate those designated nodes and disable them, because he can prevent the<br />

reception of data from the entire cluster served by the disabled node. Even if the aggregator role<br />

is changed periodically by some election process, some security issues remain, in particular in the<br />

case when the base station is off-line and the aggregator nodes must store the aggregated data<br />

temporarily until the base station goes on-line and retrieves them. More specifically, in this case,<br />

the attacker can locate and attack the node that was aggregator in a specific time epoch before<br />

the base station fetches its stored data, leading to permanent loss of data from the given cluster<br />

in the given epoch.<br />

In order to mitigate this problem, I introduced the concept of private aggregator node election,<br />

and I proposed the first private aggregator node election protocol. Briefly, the first protocol<br />

ensures that the identity of the elected aggregator remains hidden from an attacker who observes<br />


the execution of the election process. However, this protocol ensures only protection against an<br />

external eavesdropper that cannot compromise sensor nodes, and it does not address the problem<br />

of identifying the aggregator nodes by means of traffic pattern analysis after the election phase.<br />

In the second protocol, I addressed the shortcomings of the first scheme: I proposed a new<br />

private aggregator node election protocol that is resistant even to internal attacks originating<br />

from compromised nodes, and I also proposed a new private data aggregation protocol and a new<br />

private query protocol which preserved the anonymity of the aggregator nodes during the data<br />

aggregation process and when they provide responses to queries of the base station. In the second<br />

private aggregator node election protocol, each node decides locally in a probabilistic manner to<br />

become an aggregator or not, and then the nodes execute an anonymous veto protocol to verify if<br />

at least one node became aggregator. The anonymous veto protocol ensures that non-aggregator<br />

nodes learn only that there exists at least one aggregator in the cluster, but they do not learn<br />

any information on its identity. Hence, even if such a non-aggregator node is compromised, the<br />

attacker learns no useful information regarding the identity of the aggregator.<br />

The protocols can be used to protect sensor network applications that rely on data aggregation<br />

in clusters, and where locating and then disabling the designated aggregator nodes is highly<br />

undesirable. Such applications include high-confidence cyber-physical systems where sensors and<br />

actuators monitor and control the operation of some critical physical infrastructure, such as an<br />

energy distribution network, a drinking water supply system, or a chemical pipeline. A common<br />

feature of these systems is that they have a large geographical span, and therefore, the sensor<br />

network must be organized into clusters and use in-network data aggregation in order to ensure<br />

scalability and energy efficient operation. Moreover, due to the mission critical nature of these<br />

applications, it is desirable to prevent the identification of the aggregator nodes in order to limit<br />

the impact of a successful attack against the sensor network. The first protocol, which resists only an
external eavesdropper, is less complex than the second protocol, which works in a stronger attacker

model. Hence, the first protocol can be used in case of strong resource constraints or when the<br />

risk of compromising sensor nodes is limited (e.g., it may be difficult to obtain physical access to<br />

the nodes). The second protocol is needed when the risk of compromised and misbehaving nodes<br />

cannot be eliminated by other means.<br />

The remainder of the chapter is organized as follows: in Section 4.2, I introduce my system<br />

and attacker models. In Section 4.3, I present my basic aggregator election protocol which can<br />

withstand external attacks, while in Section 4.4, I introduce my advanced protocols, which can<br />

withstand internal aggregator identification and scamming attackers as well. In Section 4.5, I give<br />

an overview of some related work, and in Section 4.6, I conclude the chapter and sketch some<br />

future research directions.<br />

4.2 System and attacker models<br />

A sensor network consists of sensor nodes that communicate with each other via wireless channels.<br />

Every node can generate sensor readings, and store them or forward them to another node. Each node can

directly communicate with the nodes within its radio range; those nodes are called the (one-hop)<br />

neighbors of the node. In order to communicate with distant nodes (outside the radio range),<br />

the nodes use multi-hop communications. The sensor network has an operator as well, who can<br />

communicate with some of the nodes through a special node called base station, or can communicate<br />

directly with the nodes if the operator moves close to the network.<br />

Throughout the chapter, a data driven sensor network is envisioned, where every sensor node<br />

sends its measurement to a data aggregator regularly. Such data driven networks are used for<br />

regular inspection of monitored processes notably in critical infrastructures. Event driven networks<br />

can be used for reporting special, usually dangerous but infrequent, events such as a fire in a building.
There is no need for clustering and data aggregation in event driven systems; thus, private cluster
aggregator election and data aggregation are not applicable there. The third kind of network is the

query driven network, where the operator sends a query to the network, and the network sends a<br />


response. This kind of functionality can be combined with data driven networks, and it has privacy
implications: for example, the identity of the answering node should remain hidden.

In the following, it is assumed that time is slotted, and one measurement is sent to the

data aggregator in each time slot. The time synchronization between the nodes is not discussed<br />

here, but a comprehensive survey can be found in [Faizulkhakov, 2007].<br />

It is assumed that every node shares some cryptographic credentials with the operator. These<br />

credentials are unique for every node, and the operator can either store them in a lookup table or
generate them on demand from a master key and the node’s identifier. The exact definition of the

credentials can be found in Section 4.3.1 and in Section 4.4.1.<br />
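The on-demand generation of per-node credentials mentioned above can be illustrated with a short sketch. It is only an assumption made for illustration that the credential is a symmetric key derived with HMAC-SHA256; the exact credentials are defined in Sections 4.3.1 and 4.4.1. The function name `derive_node_key`, the 4-byte identifier encoding, and the master key value are hypothetical.

```python
import hashlib
import hmac


def derive_node_key(master_key: bytes, node_id: int) -> bytes:
    """Derive a per-node key from the operator's master key (illustrative only)."""
    # Encode the identifier as fixed-width bytes so that distinct IDs
    # always produce distinct HMAC inputs.
    message = node_id.to_bytes(4, "big")
    return hmac.new(master_key, message, hashlib.sha256).digest()


# Hypothetical master key; the operator would keep the real one secret.
master = b"operator-master-key (hypothetical)"
key_7 = derive_node_key(master, 7)
key_8 = derive_node_key(master, 8)
```

With such a derivation the operator does not need a lookup table: recomputing `derive_node_key(master, 7)` always yields the same key, while different identifiers yield different keys.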

The nodes may be aware of their geographical locations, and they may already be partitioned<br />

into well defined geographical regions. In this case, these regions are the clusters, and the objective<br />

of the aggregator election protocol is to elect an aggregator within each geographical region. We<br />

call this approach location based clustering; an example would be the PANEL protocol [Buttyán<br />

and Schaffer, 2010].<br />

A kind of generalization of the position based election is the preset case, where the nodes know<br />

the cluster ID they belong to before any communication. Here the goal of the election is to elect<br />

one node in every preset cluster. This approach is used in [Buttyán and Holczer, 2010].<br />

Alternatively, the nodes may be unaware of their locations or cluster IDs, and know only their<br />

neighbors. In this case, the clusters are not pre-determined, but they are dynamically constructed<br />

parallel to the election of the aggregators. Basically, any node may announce itself as an aggregator,<br />

and the nodes within a certain number of hops on the topology graph may join that node as cluster<br />

members. We call this approach topology based clustering; an example would be the LEACH<br />

protocol [Heinzelman et al., 2000].<br />

The location based and the topology based approaches are illustrated in Figure 4.1.<br />

Figure 4.1: Result of a location based (left), and topology based (right) one-hop aggregator election
protocol. Solid dots represent the aggregators, and empty circles represent cluster members. (Both
panels are scatter plots with axes ranging from 0 to 100.)

Both approaches may use controlled flooding of broadcast messages. In case of location based<br />

or preset clustering, the scope of a flood is restricted to a given geographic region or preset cluster.<br />

Nodes within that region re-broadcast the message to be flooded when they receive it for the first<br />

time. Nodes outside of the region or having different preset cluster IDs simply drop the message.<br />

In case of topology based clustering, it is assumed that the broadcast messages have a Time-to-Live
(TTL) field that controls the scope of the flooding. Any node that receives a broadcast message
with a positive TTL value for the first time will automatically decrement the TTL value and re-broadcast
the message. Duplicates and messages with TTL smaller than or equal to zero are silently

discarded. When I say that a node broadcasts a message, I mean such a controlled flooding (either<br />

location based, preset or topology based, depending on the context). In Section 4.4, connected<br />

dominating sets (CDS) are used to implement efficient broadcast messaging. The concept of CDS<br />

will be introduced there.<br />
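The TTL-controlled flooding described above can be sketched as follows. This is a minimal model, assuming loss-free links and a network given as an adjacency dictionary; the function name `flood` and the example topology are hypothetical and not part of the thesis.

```python
from collections import deque


def flood(adjacency, origin, ttl):
    """Return the set of nodes that accept a broadcast started by origin."""
    accepted = {origin}             # the origin already knows its own message
    queue = deque([(origin, ttl)])  # (sender, TTL carried by its transmission)
    while queue:
        sender, msg_ttl = queue.popleft()
        if msg_ttl <= 0:
            continue                # receivers discard messages with TTL <= 0
        for neighbor in adjacency[sender]:
            if neighbor in accepted:
                continue            # duplicate: silently discarded
            accepted.add(neighbor)
            # The receiver decrements the TTL and re-broadcasts the message.
            queue.append((neighbor, msg_ttl - 1))
    return accepted


# Hypothetical line topology 0 - 1 - 2 - 3 - 4.
line = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
reached = flood(line, origin=0, ttl=2)
```

Under these assumptions an initial TTL of n reaches exactly the n-hop neighborhood of the origin, which matches the notion of cluster peers used for topology based clustering.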


We can call the set of nodes which are (in the location based and the preset case) or can<br />

potentially be (in the topology based case) in the same cluster as a node S the cluster peers of S.<br />

Hence, in the location based case, the cluster peers of S are the nodes that reside within the same<br />

geographic region as node S. In the preset case, the cluster peers are the nodes sharing the same<br />

cluster ID. In the topology based case, the set of cluster peers of S usually consists of its n-hop

neighborhood, for some parameter n. The nodes may not explicitly know all their cluster peers.<br />

The main functional requirement of any clustering algorithm is that either node S or at least<br />

one of the cluster peers of S will be elected as aggregator.<br />

The leader of each cluster is called cluster aggregator, or simply aggregator. In the following I<br />

will use aggregator, cluster aggregator and data aggregator interchangeably.<br />

As mentioned in Section 4.1, an attacker can gain much more information by attacking an<br />

aggregator node than attacking a normal node. To attack a data aggregator node either physically<br />

or logically, first the attacker must identify that node. In this chapter I assume that the attacker’s<br />

goal is to identify the aggregator (which means that simply preventing, jamming or confusing the<br />

aggregation is not the goal of the attacker). In Section 4.4.5 I go a little further, and analyze what<br />

happens if a compromised node does not follow the proposed protocols in order to mislead the<br />

operator.<br />

An attacker who wants to discover the identity of the aggregators can eavesdrop on the communication
between any nodes, can actively participate in the communication (by deleting, modifying,
and inserting messages), and can physically compromise some of the nodes. A compromised node
is under the full control of the attacker: the attacker can fully review the inner state of that node
and can control the messages sent by that node.

Compromising a node is a much harder challenge for an attacker than simply eavesdropping on the
communication. It requires physical contact with the node and some advanced knowledge; however,
it is far from impossible for an attacker with a good electrical and laboratory background [Anderson
and Kuhn, 1996]. Therefore, I propose two solutions. The first, basic protocol can fully withstand a passive
eavesdropper, but a compromising attacker can gain some knowledge about the identities of the
cluster aggregators. The second, advanced protocol can withstand a compromising attacker as well,
leaking information only about the compromised nodes.

In case of a passive adversary, a rather simple solution could be based on a common shared<br />

global key. Using that shared global key as a seed of a pseudo random number generator, every<br />

node can construct locally (without any communications) the same pseudo randomly ordered list<br />

of all nodes. These lists will be identical for every node because all nodes use the same seed and the<br />

same pseudo random number generator. Then, the first A nodes of the list are elected aggregators<br />

such that every node can communicate with a cluster aggregator and no subset of A covers the<br />

whole system. An illustration of the result of this algorithm can be seen in Figure 4.1 for location

based and topology based cluster aggregator election.<br />
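The shared-key approach above can be sketched in a few lines. This is only an illustration under stated assumptions: Python's `random.Random` stands in for the common pseudo random number generator, the global key is modeled as an integer seed, and the phrase "the first A nodes" is interpreted as the shortest prefix of the list whose members, together with their neighbors, cover every node. The function names are hypothetical.

```python
import random


def ordered_node_list(global_key: int, node_ids):
    """Every node runs this locally and obtains the same pseudo-random order."""
    rng = random.Random(global_key)  # same seed -> same generator state
    order = sorted(node_ids)         # canonical starting order on every node
    rng.shuffle(order)
    return order


def first_covering_prefix(order, adjacency):
    """Shortest prefix whose members, with their neighbors, cover all nodes."""
    covered = set()
    for count, node in enumerate(order, start=1):
        covered.add(node)
        covered.update(adjacency[node])
        if len(covered) == len(adjacency):
            return order[:count]
    return order


# Hypothetical four-node topology and key.
adjacency = {0: [1, 2], 1: [0], 2: [0, 3], 3: [2]}
order = ordered_node_list(42, adjacency)
aggregators = first_covering_prefix(order, adjacency)
```

Because the list depends only on the shared key, no messages are needed for the election, which is also why leaking that single key reveals every aggregator.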

The problem with this solution is that it is not robust: compromising a single node would leak<br />

the common key, and the adversary could compute the identifier of all cluster aggregators. While I<br />

do not want to fully address the problem of compromised nodes in the first protocol, I still aim at<br />

a more robust solution than the one described above. In particular, the system should not collapse<br />

by compromising just a single or a few nodes.<br />

The second protocol can withstand the compromise of some nodes without the degradation<br />

of the privacy of the cluster aggregators. This protocol meets the following goals and has the<br />

following limitations:<br />

The identity of the non-compromised cluster aggregators remains secret even in the presence<br />

of passive and active attackers or compromised nodes.<br />

The attacker can learn whether the compromised node is an aggregator.<br />

An attacker can force a compromised node to be aggregator, but does not know anything<br />

about the existence or identity of the other aggregators.<br />

The attacker cannot achieve that no aggregator is elected in the cluster, however all the<br />

elected aggregator(s) may be compromised nodes.<br />


The main difference between the first and second protocol is the following. The first protocol<br />

is very simple, but not perfect as a compromised node can reveal the identity of the aggregators.<br />

The second protocol requires more complex computations, but offers anonymity in case of node<br />

compromise as well. In some cases such complex computations are outside the capabilities of the<br />

nodes (or the probability of compromise is low), but anonymity is still required by the system.<br />

In these cases I suggest using the first protocol. If the probability of node compromise is not

negligible, then the use of the second protocol is recommended.<br />

4.3 Basic protocol<br />

In this section, I describe the basic protocol that I propose for private aggregator node election.<br />

First I give a brief overview of the basic principles of the protocol, and present the details later.<br />

After that, some important details of this basic protocol are presented in Section 4.3.2, where I also

describe how to set the parameters of the protocol.<br />

4.3.1 Protocol description<br />

I assume that the nodes are synchronized (see [Faizulkhakov, 2007] for a survey on time synchronization
mechanisms for sensor networks), and each node starts executing the protocol at roughly
the same time. The protocol terminates after a predefined, fixed amount of time. During the

execution of the protocol, any node that has not received any aggregator announcement yet may<br />

decide to become an aggregator, in which case, it broadcasts an aggregator announcement message<br />

announcing itself as a cluster aggregator. This message is broadcast among the cluster peers of<br />

the node sending the announcement (see Section 4.2). Upon reception of a cluster aggregator<br />

announcement, any node that has neither announced itself as a cluster aggregator nor received any<br />

such announcement yet will consider the sender of the announcement as its cluster aggregator. In<br />

order to prevent an external observer from learning the identity of the cluster aggregators, all messages

sent in the protocol are encrypted such that only the nodes to whom they are intended can decrypt<br />

them. For this, it is assumed that each node shares a common key with all of its cluster peers (an<br />

overview of available key establishment mechanisms for sensor networks can be found in [Lopez<br />

and Zhou, 2008]). In addition, in order to avoid that message originators are identified as cluster<br />

aggregators, the nodes that will be cluster members are required to send dummy messages that<br />

cannot be distinguished from the announcements by the external observer (i.e., they are encrypted<br />

and disseminated in the same way as the announcements).<br />

Note that the proposed basic protocol considers only either pairwise keys between the neighboring<br />

nodes or group keys shared between sets of neighboring nodes, so no global key is assumed.<br />

Such pairwise or group keys can be established by the techniques proposed in [Lopez and Zhou,<br />

2008]. The key establishment can be based on randomly selected key sets. In such a protocol, the<br />

probability that neighboring nodes share a common key is high, and the unused keys are deleted<br />

[Chan et al., 2003]. The key establishment can be also based on a common key which is deleted<br />

after some short time when the neighbors are discovered [Zhu et al., 2003]. Any node that owns<br />

the common key can generate a pairwise key with a node which owns or previously owned the<br />

common key. The basic method for exchanging a group/cluster key with the neighboring nodes is<br />

to send the same random key to each neighbor encrypted with the previously exchanged pairwise<br />

keys.<br />

The pseudo-code of the protocol is given in Algorithm 2, and a more detailed explanation<br />

of the protocol’s operation is presented below. The protocol consists of two rounds, where the<br />

length of each round is τ. The nodes are synchronized: they all know when the first round begins,

and what the value of τ is. At the beginning, each node starts two random timers, T1 and T2,<br />

where T1 expires in the first round (uniformly at random) and T2 expires in the second round<br />

(uniformly at random). Each node also initializes at random a binary variable, called announFirst,<br />

that determines in which round the node would like to send a cluster aggregator announcement.<br />


Algorithm 2 Basic private cluster aggregator election algorithm

    start T1, expires in rand(0,τ)      // timer, expires in round 1
    start T2, expires in rand(τ,2τ)     // timer, expires in round 2
    announFirst = (rand(0,1) ≤ γ)
    CAID = -1                           // ID of the cluster aggregator of the node
    while T1 NOT expired do
        if receive ENC(announcement) AND (CAID = -1) then
            CAID = ID of sender of announcement
        end if
    end while
    // T1 expired
    if announFirst AND (CAID = -1) then
        broadcast ENC(announcement);
        CAID = ID of node itself;
    else
        broadcast ENC(dummy);
    end if
    while T2 NOT expired do
        if receive ENC(announcement) AND (CAID = -1) then
            CAID = ID of sender of announcement
        end if
    end while
    // T2 expired
    if (NOT announFirst) AND (CAID = -1) then
        broadcast ENC(announcement);
        CAID = ID of node itself;
    else
        broadcast ENC(dummy);
    end if



Table 4.1: Estimated time of the building blocks on a Crossbow MICAz mote

    Algorithm                                 Generation [ms]   Verification [ms]
    SHA-1 [Ganesan et al., 2003]              1.4               –
    RSA 1024 bit [Piotrowski et al., 2006]    12040             470
    RC4 [Ganesan et al., 2003]                0.1               0.1
    RC5 [Ganesan et al., 2003]                0.4               0.4


The probability that announFirst is set to the first round is γ, which is a system parameter. The<br />

setting of γ is elaborated in Section 4.3.2.<br />

In the first round, every node S waits for its first timer T1 to expire. If S receives an announcement<br />

before T1 expires, then the sender of the announcement will be the cluster aggregator of S.<br />

When T1 expires, S broadcasts a message as follows: if announFirst is set to the first round and<br />

S has not received any announcement yet, then S sends an announcement, in which it announces<br />

itself as a cluster aggregator. Otherwise, S sends a dummy message. In both cases, the message is<br />

encrypted (denoted by ENC() in the algorithm) such that only the cluster peers of S can decrypt<br />

it.<br />

The second round is similar to the first round. When T2 expires S broadcasts a message as<br />

follows: if announFirst is set to the second round and S has not received any announcement yet,<br />

then S sends an announcement, otherwise, S sends a dummy message. In both cases, the message<br />

is encrypted.<br />

It is easy to see that at the end of the second round each node is either a cluster aggregator or<br />

it is associated with a cluster aggregator whose ID is stored in variable CAID. Without the second<br />

round, a node can remain unassociated, if it sends and receives only dummy messages in the first<br />

round. In addition, a passive observer only sees that every node sends two encrypted messages,<br />

one in each round. This makes it difficult for the adversary to identify who the cluster aggregators<br />

are (see also more discussion on this in the next section). In addition, if a node is compromised,<br />

the adversary learns only the identity of the cluster aggregators whose announcements have been<br />

received by the compromised node.<br />
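The two rounds described above can be simulated with a short event-driven sketch. It is a simplified model rather than the protocol implementation: it assumes instantaneous, loss-free delivery, symmetric cluster peer relations, and node identifiers different from -1, and it ignores encryption because ENC() does not affect who is elected. The function name `basic_election` and the example topology are hypothetical.

```python
import random


def basic_election(peers, tau=1.0, gamma=0.3, seed=None):
    """Simulate one run; peers maps each node to its set of cluster peers.

    Returns a dict mapping every node to the ID of its cluster aggregator.
    """
    rng = random.Random(seed)
    caid = {node: -1 for node in peers}  # -1: no aggregator known yet
    announ_first = {node: rng.random() <= gamma for node in peers}
    events = []
    for node in peers:
        events.append((rng.uniform(0, tau), node, 1))        # T1, round 1
        events.append((rng.uniform(tau, 2 * tau), node, 2))  # T2, round 2

    for _, node, rnd in sorted(events):  # process timers in time order
        wants_now = announ_first[node] if rnd == 1 else not announ_first[node]
        if wants_now and caid[node] == -1:
            caid[node] = node            # broadcast ENC(announcement)
            for peer in peers[node]:
                if caid[peer] == -1:
                    caid[peer] = node    # the first announcement received wins
        # Otherwise the node broadcasts ENC(dummy), which changes no state.
    return caid


# Hypothetical ring of ten nodes, each with two cluster peers.
ring = {i: {(i - 1) % 10, (i + 1) % 10} for i in range(10)}
result = basic_election(ring, seed=1)
```

In this model every node ends the second round with a valid CAID: it is either its own aggregator or it is associated with one of its cluster peers.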

In WSNs, it must be analyzed what happens if some messages are delayed or lost in the noisy,
unreliable channel. Two cases must be considered: dummy messages and announcements. If a
dummy message is delayed or not delivered successfully to all recipients, then the result of the
protocol is not modified, as dummy messages serve only to cover the announcements. If an
announcement is delayed or not delivered to a node, then the recipient will not select the sender as
its cluster aggregator. Instead, it will select a node that sends an announcement later, or it elects itself
and sends an announcement. Message loss may modify the resulting set of cluster aggregators,
but it harms neither the anonymity of the elected aggregators nor the original goal of cluster
aggregator election (a node must either be a cluster aggregator itself, or a cluster aggregator must be
elected from the node's cluster peers).

Note that two neighboring nodes can send an announcement at the same time with some small<br />

probability. Actually, it is not a problem in the protocol. The only result is that both nodes<br />

will be cluster aggregators independently. As it is not conflicting with the original goal of cluster<br />

aggregator election, this infrequent situation does not need any special attention.<br />

The overhead introduced by the basic protocol is sending two encrypted messages for each<br />

election round. Other protocols [Buttyán and Schaffer, 2010; Heinzelman et al., 2000] use one (or
zero) unencrypted messages to elect an aggregator. So the number of messages sent in the election
phase is slightly larger compared to other solutions. The symmetric encryption also causes some

extra overhead (for details, see Table 4.1, rows with RC4 and RC5).<br />


4.3.2 Protocol analysis<br />

In this section the previously suggested basic protocol is analyzed. As defined in Section 4.2, the<br />

main goal of the attacker is to reveal the identity of the cluster aggregators. To do so, the attacker<br />

can eavesdrop, modify, and delete messages, and can capture some nodes.<br />

First, the logical attacks are analyzed, where the attacker does not capture any nodes; then, the
consequences of a node capture are discussed.
The attacker's main goal is to reveal the identity of the cluster aggregators. As all the inter-node
communication is encrypted and authenticated, the attacker cannot get any information from the messages
themselves, but it can get some side information from simple traffic and topology analysis.

Density based attack<br />

Thanks to the dummy messages and the encryption in the basic protocol, an external observer<br />

cannot trivially identify the cluster aggregators; however, it can still use side information and<br />

suspect some nodes to be cluster aggregators with higher probability than some other nodes. One
such piece of side information is the number of cluster peers of the nodes. This number correlates with
the local density of the nodes, which is why this attack is called the density based attack. Indeed, the

probability of becoming a cluster aggregator depends on the number of the cluster peers of the<br />

node. For instance, if a node does not have any cluster peers, it will be a cluster aggregator with<br />

probability one. On the other hand, if the node has a larger number of cluster peers, then the<br />

probability of receiving an announcement from a cluster peer is large, and hence, the probability<br />

that the node itself becomes cluster aggregator is small. Note also that the number of cluster peers<br />

can be deduced from the topology of the network, which may be known to the adversary.<br />

The probability of becoming a cluster aggregator is approximately inversely proportional to the<br />

number of cluster peers:<br />

$$\Pr(CA(S)) \cong \frac{1}{D(S)} \tag{4.1}$$

where CA(S) is the event of S being elected cluster aggregator, and D(S) is the number of cluster<br />

peers of node S. Figure 4.2 illustrates this proportionality where the curve belongs to Equation 4.1<br />

and the plotted dots correspond to simulation results (100 nodes, random deployment, one hop<br />

communication, topology based clustering). It can be seen that Equation 4.1 is quite accurate: it is
very close to the simulated results.
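The inverse proportionality can be checked in one special case with a small Monte Carlo sketch. It assumes a fully connected cluster of d nodes (so D(S) = d, counting S itself, in line with the "+ 1" term of Equation 4.2), instant and loss-free delivery, and no timer ties. This is a much simpler setting than the random deployment used for the simulation in the text, and the function names and parameter values are hypothetical.

```python
import random
from collections import Counter


def clique_winner(d, gamma, rng):
    """Return the node elected in a clique of d nodes."""
    # Nodes that want to announce in the first round.
    first_round = [node for node in range(d) if rng.random() <= gamma]
    # If nobody announces in round 1, all nodes are candidates in round 2.
    candidates = first_round if first_round else list(range(d))
    # Timers are i.i.d. uniform, so the earliest candidate is a uniform choice.
    return rng.choice(candidates)


def empirical_probabilities(d, gamma=0.3, trials=20000, seed=0):
    """Estimate, for each node, the probability of becoming the aggregator."""
    rng = random.Random(seed)
    wins = Counter(clique_winner(d, gamma, rng) for _ in range(trials))
    return [wins[node] / trials for node in range(d)]


probs = empirical_probabilities(5)
```

In such a clique exactly one node is elected per run, and by symmetry each node wins with probability 1/d, so every entry of `probs` should be close to 0.2.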

Two approaches can be used to mitigate this problem. One is to take the number of cluster<br />

peers of the nodes into account when generating the random timers for the protocol. The second<br />

is to balance the logical network topology in such a way that every node has the same number of<br />

cluster peers. In the following a possible solution for both approaches is introduced.<br />

The first approach amounts to fine tuning the timer distributions. It is not analyzed here in depth,
because it can only slightly modify the probabilities of being cluster aggregator, so it has no large
effect. An example can be seen in Figure 4.3, where the 10th power of D(S) is used as a normalizing
factor when γ (the probability of sending an announcement in the first round) is computed. The
coefficients of the polynomial are set such that the resulting curve is as close as possible to the uniform
distribution. It can be seen that modifying γ on a per node basis does not ultimately reach its goal: the
normalized distribution is far from uniform. Moreover, modifying γ is the means of mitigating the other
attack discussed in the next section, so here I propose a solution which does not set the γ parameter.

The second, more efficient approach modifies the number of cluster peers of each node such
that it becomes a common value for all nodes. Let us denote this value by α. In theory, this common value can be anything

between 1 and the total number N of the nodes in the network. In practice, it should be around<br />

the average number of cluster peers, which can be estimated locally by the nodes. For example,<br />

assuming one-hop communications (meaning that the cluster peers are the radio neighbors), the<br />

following formula can be used:<br />



Figure 4.2: Probability of being cluster aggregator as a function of the number of cluster peers.
(The plot compares simulation results with the analytical curve.)

$$\alpha = (N - 1)\frac{R^2 \pi}{A} + 1 \simeq E(D(S)) \tag{4.2}$$

where R is the radio range, and A is the size of the total area of the network. The formula is based<br />

on the fact that the number of cluster peers is proportional to the ratio between radio coverage<br />

and total area. Similar formulae can be derived for the general case of multi-hop communication.<br />
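As a worked example of Equation 4.2, the following sketch evaluates α for one set of values. The 100-node, 100 × 100 area setting mirrors the simulations mentioned in the text, but the radio range of 10 units is an assumption chosen only for illustration, and the function name is hypothetical.

```python
import math


def expected_cluster_peers(n_nodes, radio_range, area):
    """Equation 4.2: expected number of cluster peers for one-hop clusters."""
    return (n_nodes - 1) * (radio_range ** 2) * math.pi / area + 1


# Assumed values: N = 100 nodes, R = 10, A = 100 * 100.
alpha = expected_cluster_peers(n_nodes=100, radio_range=10.0, area=100.0 * 100.0)
```

With these values α is approximately 4.11, so each node would aim for about four cluster peers, counting itself.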

If a node S has more than α cluster peers, it can simply discard the messages from D(S) − α
randomly chosen cluster peers. If S has fewer than α cluster peers, it must get new cluster peers with
the help of its actual cluster peers (if S originally has no cluster peers at all, then it will always

become a cluster aggregator). The new cluster peers can be selected from the set of cluster peers of<br />

the original cluster peers. To explore the potential new cluster peers, every node can broadcast its<br />

list of cluster peers within its few hop neighborhood before running the basic protocol. From the<br />

lists of the received cluster peers, every node can select its α − D(S) new cluster peers uniformly<br />

at random. Then, the basic aggregator election protocol can be executed using the balanced set of<br />

cluster peers. An example for this balancing is shown in Figure 4.4 (70 nodes, random deployment,<br />

one hop communication, topology based clustering).<br />

After running the balancing protocol, every node can approach the envisioned α value. The
advantage of the balancing protocol is that, although an attacker can gather information about
the original number of cluster peers, this number is effectively balanced after the protocol. The drawback of

this solution is that it requires the original cluster peers to relay messages between distant nodes.<br />

One can imagine this solution as selectively increasing the TTL of protocol messages creating much<br />

larger neighborhoods.<br />
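The balancing step can be sketched from the point of view of each node as follows. This is a simplified model, assuming that every node already knows the peer lists of its own cluster peers, that peer sets here exclude the node itself, and that `target` is the desired number of other peers. It computes each node's local view only and does not model the message relaying. The function name and the example topology are hypothetical.

```python
import random


def balance_peers(peers, target, seed=None):
    """Return new peer sets whose sizes are moved towards `target`."""
    rng = random.Random(seed)
    balanced = {}
    for node, own in peers.items():
        own = set(own)
        if len(own) > target:
            # Too many peers: keep a random subset and ignore the rest.
            balanced[node] = set(rng.sample(sorted(own), target))
        elif 0 < len(own) < target:
            # Too few peers: recruit from the peers of the current peers.
            candidates = set().union(*(peers[p] for p in own)) - own - {node}
            extra = min(target - len(own), len(candidates))
            balanced[node] = own | set(rng.sample(sorted(candidates), extra))
        else:
            balanced[node] = own  # already balanced, or an isolated node
    return balanced


# Hypothetical topology: node 0 is a hub, node 5 hangs off node 4.
topology = {0: {1, 2, 3, 4}, 1: {0}, 2: {0}, 3: {0}, 4: {0, 5}, 5: {4}}
balanced = balance_peers(topology, target=2, seed=3)
```

In this example the hub drops two of its four peers, while the leaf nodes each recruit one two-hop neighbor, so every node ends up with exactly two peers.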

Order based attack<br />

Another important piece of side information an attacker can use is the order in which the nodes send

messages in the first round of the protocol. Indeed, the sender of the i-th message will be cluster<br />

aggregator if none of the previous i − 1 messages are announcements (but dummies) and the i-th<br />

message is an announcement. Thus, the probability Pi that the sender of the i-th message becomes<br />

cluster aggregator depends on i and parameter γ:<br />


Figure 4.3: Probability of being cluster aggregator as a function of the number of cluster peers. The
analytical values come from Equation 4.1, while the simulation values come from simulation,
where the γ probabilities are normalized with the number of cluster peers of the nodes. (Plot legend:
Normalized, Original.)

$$P_i = (1 - \gamma)^{i-1} \gamma, \quad 1 \le i \le n$$

The (n + 1)-th element of the distribution is the probability that no announcement is sent in<br />

the first round:<br />

$$P_{n+1} = (1 - \gamma)^{n}$$

in which case the sender of the first message of the second round must be a cluster aggregator.<br />

The entropy of this distribution characterizes the uncertainty of the attacker who wants to<br />

identify the cluster aggregator using the order information. Assuming that the number of cluster<br />

peers has been already balanced, this entropy can be calculated as follows:<br />

Figure 4.4: Result of balancing. The 70 nodes are represented on the x axis. The number of cluster
peers before (left), and after (right) the balancing are represented on the y axis.

$$H = -\sum_{i=1}^{n+1} P_i \log P_i = -\sum_{i=1}^{n} \left( (1 - \gamma)^{i-1} \gamma \, \log\left( (1 - \gamma)^{i-1} \gamma \right) \right) - (1 - \gamma)^{n} \log (1 - \gamma)^{n} \tag{4.3}$$

where γ is the probability of sending an announcement in the first round and n is the balanced<br />

number of cluster peers.<br />
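As an illustration, the distribution of the cluster aggregator's position and its entropy can be evaluated numerically. The following is a minimal sketch, not the estimation algorithm used in the thesis: the function names, the grid search, and the choice of base-2 logarithm (suggested by the log 2 in the caption of Table 4.2) are assumptions, and its output is not claimed to reproduce the values of Table 4.2.

```python
import math


def order_distribution(gamma, n):
    """Return [P_1, ..., P_n, P_{n+1}] for announcement probability gamma."""
    probs = [(1 - gamma) ** (i - 1) * gamma for i in range(1, n + 1)]
    probs.append((1 - gamma) ** n)  # no announcement in the first round
    return probs


def entropy(gamma, n):
    """Shannon entropy in bits; terms with zero probability contribute 0."""
    return -sum(p * math.log2(p) for p in order_distribution(gamma, n) if p > 0)


def best_gamma(n, steps=10_000):
    """Hypothetical grid search for the gamma in (0, 1) with maximal entropy."""
    candidates = [k / steps for k in range(1, steps)]
    return max(candidates, key=lambda g: entropy(g, n))


if __name__ == "__main__":
    for n in (10, 25, 50, 100):
        g = best_gamma(n)
        print(n, g, round(entropy(g, n), 3))
```

A finer grid or a one-dimensional optimizer could replace the grid search; the grid is used here only because it is easy to check by hand.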

Figure 4.5: Entropy of the attacker (y axis, 0 to 3.5) as a function of the probability of sending an announcement in the first round (γ, x axis, 0 to 1).<br />

Number of nodes in one cluster: 10.<br />

In Figure 4.5, I plotted formula (4.3). If γ is large, then the uncertainty of the attacker is low,<br />

because one of the first few senders will become the cluster aggregator with very high probability.<br />

If γ is very small, then the uncertainty of the attacker is small again, because no cluster aggregator<br />

will be elected in the first round with high probability, and therefore, the first sender of the second<br />

round will be the cluster aggregator. The ideal γ value corresponds to the maximum entropy,<br />

which can be easily computed by the nodes locally from formula (4.3). For instance, Table 4.2<br />

shows some ideal γ values for different numbers of nodes in one cluster. The fifth row ($H_{max}$)<br />

shows the maximal entropy (uncertainty) that any kind of election protocol can achieve with the<br />

given number of nodes. This is achieved if every node is equiprobably elected from the viewpoint<br />

of the attacker. This value is closely approached by $H(\hat{\gamma})$, where $\hat{\gamma}$ is very close to the optimal<br />

solution (the difference between the found value and the optimal value can be arbitrarily small,<br />

and depends on the number of iterations the estimation algorithm uses). Using the found $\hat{\gamma}$ value,<br />

the order of the messages has no meaning for the attacker.<br />

Node capture attacks<br />

If an attacker can compromise a node, it can reveal some sensitive information, even when the<br />

system uses the local key based protocol. If the compromised node is a cluster aggregator, then<br />

all the previously stored messages can be revealed. The attacker can decide to demolish the node,<br />

modify the stored values, simply use the captured data, or modify the aggregation functions.<br />


Table 4.2: Optimal γ values (γ̂) for different numbers of nodes in one cluster. Achieved entropy<br />

(H(γ̂)) and maximal entropy ($H_{max} = \log_2 n$)<br />

n         10      25      50      100
γ̂        0.167   0.082   0.049   0.027
nγ̂       1.67    2.05    2.45    2.7
H(γ̂)     3.281   4.410   5.312   6.218
Hmax      3.322   4.644   5.644   6.644

If the compromised node is not a cluster aggregator, then the attacker can reveal the cluster<br />

aggregator of that node, which can result in the same situation described in the previous paragraph.<br />

4.3.3 Data forwarding and querying<br />

The problem of forwarding the measured data to the aggregators without revealing the identity<br />

of the aggregators is a well known problem in the literature, called anonymous routing [Seys and<br />

Preneel, 2006; Zhang et al., 2006; Rajendran and Sreenaath, 2008].<br />

Anonymous routing lets us route packets in the network without revealing the destination of<br />

the packet. A short overview of anonymous routing can be found in Section 4.5.<br />

With anonymous routing, any node can send its measurements to the aggregators without<br />

revealing the identity of the aggregators. An operator can query the aggregator with the help of an ordinary<br />

node which uses anonymous routing towards the aggregator.<br />

Anonymous routing introduces significant overhead in the traffic. However, this can be partially<br />

mitigated by synchronizing the data transmissions. Instead of suggesting such an approach, in<br />

Section 4.4.3 of this chapter I elaborate a more challenging situation where the identity of the aggregators is unknown<br />

even to the cluster members. The clear advantage is that even if a node is<br />

compromised, its aggregator cannot be identified.<br />

4.4 Advanced protocol<br />

The advanced private data aggregation protocol is designed to withstand the compromise of some<br />

nodes without revealing the identity of the aggregators. The protocol consists of four main parts.<br />

The first part is the initialization, which provides the required communication channel. The second<br />

part is needed for the data aggregator election. This subprotocol must ensure that the cluster does<br />

not remain without a cluster aggregator. This must be done without revealing the identity of the<br />

elected aggregator. The third part is needed for the data aggregation. This subprotocol must be<br />

able to forward the measured data to the aggregator without knowing its identifier. The last part<br />

must support the queries, where an operator queries some stored aggregated data.<br />

In the following, the description of each subprotocol follows the same pattern. First the goal<br />

and the requirements of the subprotocol are discussed, then the subprotocol itself is presented.<br />

After the presentation of the subprotocol, I analyze how it achieves its goal even in the presence<br />

of an attacker, and what data and services it provides to the next subprotocol.<br />

At the end of this section, misbehavior is analyzed. I discuss what an attacker can achieve<br />

if its goal is not to identify the aggregators of the cluster, but to confuse the operation of the<br />

protocols.<br />

In the following, it is assumed that every node knows which cluster it belongs to. The protocol<br />

descriptions consider only one cluster, and separate instances of the protocol are run in<br />

different clusters independently.<br />

The complexity of each subprotocol is summarized in Table 4.3. This table gives an overview of<br />

the message complexity of the used subprotocols, so the bandwidth requirements can be calculated<br />

from it. It can be seen that the rarely used election protocol has the highest complexity, and the<br />

frequently used aggregation is the most lightweight protocol in use.<br />



4.4. Advanced protocol<br />

Table 4.3: Summary of complexity of the advanced protocol. N is the number of nodes in the<br />

cluster<br />

                          Election           Aggregation   Query
Message complexity        O(N^2)             O(N)          O(N)
Modular exponentiations   4N (footnote 1)    0             0
Hash computations         0                  0             1

4.4.1 Initialization<br />

The initialization phase is responsible for providing the medium for authenticated broadcast communication.<br />

In the following, I briefly review the approaches to broadcast authentication in wireless<br />

sensor networks, and give some efficient methods for broadcast communication.<br />

The initialization relies on some data stored on each node before deployment. Each node<br />

has some unique cryptographic credentials to enable authentication, and is aware of the cluster<br />

identifier it belongs to. In the following, it is assumed without further mention that each<br />

message contains the cluster identifier. Every message addressed to a cluster different from the one<br />

a node belongs to is discarded by the node. First, I briefly review the state of the art in broadcast<br />

authentication, then I propose a connected dominating set based broadcast communication method,<br />

which fits well with the following aggregation and query phases.<br />

Broadcast authentication<br />

Broadcast authentication enables a sender to broadcast some authenticated messages efficiently to<br />

a large number of potential receivers. In the literature, this problem is solved with either digital<br />

signatures or hash chains. In this section, I review some solutions from both approaches.<br />

For the sake of completeness, Message Authentication Codes (MAC) must also be mentioned<br />

here [Preneel and Oorschot, 1999]. MACs are based on symmetric cryptographic primitives, which<br />

enable very efficient computation. Unfortunately, the verifier of a MAC must also possess the<br />

same cryptographic credential the generator used for generating the MAC. It means that every<br />

node must know every credential in the network, to verify every message broadcast to the network.<br />

This full knowledge can be exploited by an attacker who compromises a node. The attacker can<br />

impersonate any other honest node, which means that if only one node is compromised, message<br />

authenticity can no longer be ensured.<br />

One solution to the node compromise is the hop by hop authentication of the packets. In hop<br />

by hop authentication, every packet's authentication information is regenerated by every forwarder.<br />

In this case, it is enough to only have a shared key with the direct neighbors of a node. In case<br />

of node compromise, only the node itself and the direct neighbors can be impersonated. Such a<br />

neighborhood authentication is provided by Zhu et al. in LEAP [Zhu et al., 2003], where it is<br />

based on so-called cluster keys.<br />

To make the authentication scheme robust against node compromise, one approach is the usage<br />

of asymmetric cryptography, namely digital signatures.<br />

Digital signatures are asymmetric cryptographic primitives, where only the owner of a private<br />

key can compute a digital signature over a message, but any other node can verify that signature.<br />

Computing a digital signature is a time consuming task for a typical sensor node, but there exist<br />

some efficient elliptic curve based approaches in the literature [Liu and Ning, 2008; Szczechowiak<br />

et al., 2008; Oliveira et al., 2008; Xiong et al., 2010].<br />

One of the first publicly available implementations was the TinyECC module written by Liu and<br />

Ning [Liu and Ning, 2008]. A more efficient implementation is the NanoECC module, proposed<br />

by Szczechowiak et al. [Szczechowiak et al., 2008], which is based on the MIRACL cryptographic<br />

library [mir, ]. Up to now, to the best of my knowledge, the fastest implementations are<br />

TinyPBC by Oliveira et al. [Oliveira et al., 2008], which is based on the RELIC toolkit [rel, ], and<br />

TinyPairing, proposed by Xiong et al. in [Xiong et al., 2010].<br />

Footnote 1 (Table 4.3): 4 exponentiations for generating the two messages with knowledge proofs and 4N − 4 exponentiations for checking<br />

the received knowledge proofs.<br />

Another approach is proposed for broadcast authentication in wireless sensor networks by Perrig<br />

et al. in [Perrig et al., 2002]. The µTESLA scheme is based on delayed release of hash chain<br />

values used in MAC computations. The scheme needs secure loose time synchronization between<br />

the nodes. The µTESLA scheme is efficient if it is used for authenticating many messages, but<br />

inefficient if the messages are sparse. Consequently, if only the rarely sent election messages must<br />

be authenticated, then the time synchronization itself can cause a heavier workload than simple<br />

digital signatures. If the aggregation messages must also be authenticated, then µTESLA can<br />

be an efficient solution. A DoS resistant version specially adapted for wireless sensor networks is<br />

proposed by Liu et al. in [Liu et al., 2005]. A faster but less secure modification is proposed by<br />

Huang et al. in [Huang et al., 2009].<br />
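The hash chain idea behind delayed key disclosure can be illustrated with a short sketch. This is not the µTESLA specification: the function names are hypothetical, SHA-256 and HMAC are assumed as the one-way function and the MAC, the chain key is used directly as the MAC key, and time synchronization is left out entirely. It only shows that a receiver holding the commitment K_0 can check a later disclosed key by hashing it back to K_0.

```python
import hashlib
import hmac


def make_key_chain(seed: bytes, length: int) -> list:
    """Return [K_0, ..., K_length] with K_i = SHA-256(K_{i+1}).

    K_0 is the public commitment; the seed becomes the last key K_length.
    """
    chain = [seed]
    for _ in range(length):
        chain.append(hashlib.sha256(chain[-1]).digest())
    chain.reverse()
    return chain


def key_is_authentic(disclosed: bytes, interval: int, commitment: bytes) -> bool:
    """Hash the disclosed key `interval` times and compare it with K_0."""
    value = disclosed
    for _ in range(interval):
        value = hashlib.sha256(value).digest()
    return hmac.compare_digest(value, commitment)


def mac(key: bytes, message: bytes) -> bytes:
    """MAC of a message under the (not yet disclosed) key of an interval."""
    return hmac.new(key, message, hashlib.sha256).digest()


if __name__ == "__main__":
    chain = make_key_chain(b"secret seed", 5)
    tag = mac(chain[3], b"election message")           # sent in interval 3
    # Later the sender discloses chain[3]; the receiver checks key, then MAC.
    print(key_is_authentic(chain[3], 3, chain[0]))
    print(hmac.compare_digest(tag, mac(chain[3], b"election message")))
```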

In the following, it is assumed without explicit mention that an efficient broadcast authentication scheme is<br />

used.<br />

Broadcast communication<br />

Broadcast communication is a method that enables sending information from one source to every<br />

other participant of the network. In wireless networks it can be implemented in many ways, like<br />

flooding the network or with a sequence of unicast messages.<br />

A natural question is why broadcast communication is so important to the advanced<br />

protocol. The reason is that only broadcast communication can hide the traffic patterns of the<br />

communication, thus not revealing any information about the aggregators.<br />

An efficient way of implementing broadcast communication in wireless sensor networks is the<br />

usage of a connected dominating set (CDS). A connected dominating set S of a graph G is defined<br />

as a subset of the vertices of G such that every vertex in G − S is adjacent to at least one member of S, and the subgraph induced by S is<br />

connected. A graphical representation of a CDS can be found in Figure 4.6. The minimum connected<br />

dominating set (MCDS) is a connected dominating set with minimum cardinality. Finding<br />

an MCDS in a graph is an NP-hard problem; however, there are some efficient solutions which can<br />

find a close to minimal CDS in WSNs. For a thorough review of the state of the art of CDS in<br />

WSNs, the interested reader is referred to [Blum et al., 2004a] and [Jacquet, 2004].<br />

In the following, it is assumed that a connected dominating set is given in each cluster, and a<br />

minimum spanning tree is generated between the nodes in the CDS. Finding a minimum spanning<br />

tree in a connected graph has been a well-known problem for decades. Efficient polynomial algorithms<br />

are suggested in [Kruskal, 1956; Prim, 1957]. This kind of two-layer communication architecture<br />

enables the efficient implementation of different kinds of broadcast-like communication, which are<br />

required for the following protocols. The spanning tree is used in the aggregation protocol in<br />

Section 4.4.3.<br />

The simple all node broadcast communication can be implemented simply: if a node sends a<br />

packet to the broadcast address, then every node in the CDS forwards this message to the broadcast<br />

address. The CDS members are connected and every non CDS member is connected to at least one<br />

CDS member by definition, so the message will be delivered to every recipient in the network. This<br />

approach is more efficient than simple flooding as only a subset of the nodes forwards the message,<br />

but the properties of the CDS ensure that every node in the cluster will eventually receive the<br />

broadcast information. Here, the notion of CDS parent (or simply parent) must be introduced.<br />

The CDS parent of node A is a node, which is in communication distance with A and is a member<br />

of the CDS.<br />
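The CDS definition and the CDS-based broadcast described above can be sketched as follows. The graph representation (a dictionary mapping each node to the set of its radio neighbors) and both function names are assumptions made for illustration; the sketch ignores packet loss and timing.

```python
from collections import deque


def is_connected_dominating_set(graph, cds):
    """Check the two defining properties of a CDS on an undirected graph."""
    cds = set(cds)
    if not cds:
        return False
    # Domination: every node outside the set has a neighbor inside it.
    dominated = all(graph[v] & cds for v in graph if v not in cds)
    # Connectivity: the subgraph induced by the set is connected.
    start = next(iter(cds))
    seen, queue = {start}, deque([start])
    while queue:
        v = queue.popleft()
        for u in graph[v] & cds:
            if u not in seen:
                seen.add(u)
                queue.append(u)
    return dominated and seen == cds


def cds_broadcast(graph, cds, source):
    """Simulate one broadcast; return (nodes reached, number of transmissions).

    The source transmits once, and every CDS member that hears the message
    retransmits it exactly once. Ordinary nodes only receive.
    """
    cds = set(cds)
    reached, transmitted = {source}, set()
    queue = deque([source])
    while queue:
        v = queue.popleft()
        if v in transmitted:
            continue
        transmitted.add(v)
        for u in graph[v]:
            reached.add(u)
            if u in cds and u not in transmitted:
                queue.append(u)
    return reached, len(transmitted)


if __name__ == "__main__":
    graph = {1: {2}, 2: {1, 3, 5}, 3: {2, 4, 6}, 4: {3}, 5: {2}, 6: {3}}
    print(is_connected_dominating_set(graph, {2, 3}))
    print(cds_broadcast(graph, {2, 3}, source=1))
```

In this example the source is an ordinary node, so the broadcast costs one transmission more than the size of the CDS; if the source is itself a CDS member, the count equals the size of the CDS.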

The complexity of such a broadcast communication is O(N), but actually it takes |S| messages<br />

to broadcast some information, where |S| is the number of nodes in the connected dominating<br />

set. If the CDS algorithm is accurate, then it can be very close to the minimum number of nodes<br />

required for broadcast communication.<br />

In the following, broadcast communication is used frequently to prevent an attacker from gaining<br />

some knowledge about the identity of the aggregators from the traffic patterns inside the network.<br />

Obviously, not every message is broadcast in the network, because that would quickly lead to<br />

battery depletion and inoperability of the sensor network. Instead of automatically broadcasting<br />

every message, as much information as possible is aggregated in each message to preserve energy.<br />

In the following sections, I will use the given CDS in different ways, and each particular usage will<br />

be described in the corresponding section.<br />

Figure 4.6: Connected dominating set (both axes span 0 to 100). Solid dots represent the dominating set, and empty circles<br />

represent the remaining nodes. The connections between the non CDS nodes of the network are not<br />

displayed in the figure.<br />

The used communication patterns are closely related to and inspired by the Echo algorithm<br />

published by Chang in [Chang, 2006]. The Echo algorithm is a Wave algorithm [Tel, 2000], which<br />

enables the distributed computation of an idempotent operator in trees. It can be used in arbitrary<br />

connected graphs, and generates a spanning tree as a side result.<br />

4.4.2 Data aggregator election<br />

The main goal of the aggregator node election protocol is to elect a node that can store the<br />

measurements of the whole cluster in a given epoch, but in such a way that its identity remains<br />

hidden. The election is successful if at least one node is elected. The protocol is unsuccessful if<br />

no node is elected, thus no node stores the data. In some cases, electing more than one node can<br />

be advantageous, because the redundant storage can withstand the failure of some nodes. In the<br />

following, I propose an election protocol, where the expected number of elected aggregators can<br />

be determined by the system operator, and the protocol ensures that at least one aggregator is<br />

always elected.<br />

The election process relies on the initialization subprotocol discussed in Section 4.4.1. It requires<br />

an authenticated broadcast channel among the cluster members, which is exactly what the<br />

initialization part offers.<br />

The election process consists of two main steps: (i) Every node decides whether it wants to<br />

be an aggregator, based on some random values. This step does not need any communication,<br />

the nodes compute the results locally. (ii) In the second step, an anonymous veto protocol is run,<br />

which reveals only the information that at least one node elected itself to be aggregator node. If<br />

no aggregator is elected, it will be clear for every participant, and every participant can run the<br />

election protocol again.<br />


Step (i) can be implemented easily. Every node elects itself aggregator with a given probability<br />

p. The result of the election is kept secret, the participants only want to know that the number c<br />

of aggregators is not zero, without revealing the identity of the cluster aggregators. This is advantageous,<br />

because in case of node compromise, the attacker learns only whether the compromised<br />

node is an aggregator, but nothing about the identity or the number of the other aggregators. Let<br />

us denote the random variable representing the number of elected aggregators with C. It is easy<br />

to see that the distribution of C is binomial (N is the total number of nodes in one cluster):<br />

$$\Pr(C = c) = \binom{N}{c} p^{c} (1 - p)^{N - c}$$

The expected number of aggregators after the first step is $c_E = Np$. So if on average $\hat{c}$ cluster<br />

aggregators are needed, then $p$ should be $\hat{c}/N$ (this formula will be slightly modified after considering<br />

the results of the second step).<br />

The probability that no cluster aggregator is elected is $(1 - p)^{N}$. To avoid this anarchical<br />

situation when no node is elected, the nodes must run step (ii) which proves that at least one node<br />

is elected as aggregator node, but the identity of the aggregator remains secret. This problem can<br />

be solved by an anonymous veto protocol. Such a protocol is suggested by Hao and Zieliński in<br />

[Hao and Zielinski, 2006].<br />

Hao and Zieliński’s approach has many advantageous properties compared to other solutions<br />

[Brandt, 2006; Chaum, 1988], such as it requires only 2 communication rounds.<br />

The anonymous veto protocol requires knowledge proofs. Informally, a knowledge proof allows a<br />

prover to convince a verifier that he knows a solution of a hard-to-solve problem without revealing<br />

any useful information about the knowledge. A detailed explanation of the problem can be found<br />

in [Camenisch and Stadler, 1997].

A well-known example of a knowledge proof is given by Schnorr in [Schnorr, 1991]. The proposed<br />

method gives a non-interactive proof of knowledge of a logarithm without revealing the logarithm<br />

itself. The operation can be described briefly as follows. The proof of knowledge of the exponent of<br />

$g^{x_i}$ consists of the pair $\{g^{v}, r = v - x_i h\}$, where $h = H(g, g^{v}, g^{x_i}, i)$ and $H$ is a secure hash function.<br />

This proof of knowledge can be verified by anyone through checking whether $g^{v}$ and $g^{r} g^{x_i h}$ are<br />

equal.<br />
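The proof and its verification can be sketched in a few lines. The group parameters below (a toy safe prime 2039 with a subgroup of prime order 1019 generated by 4), the use of SHA-256 for H, and the way the hash inputs are encoded are assumptions chosen only so that the example can be checked by hand; they are far too small for real use and are not taken from the thesis.

```python
import hashlib
import secrets

P = 2039  # toy safe prime, P = 2 * Q + 1 (illustration only, not secure)
Q = 1019  # prime order of the subgroup generated by G
G = 4     # generator of the order-Q subgroup (a quadratic residue mod P)


def challenge(*values):
    """h = H(g, g^v, g^x, i), reduced modulo the group order."""
    data = "|".join(str(v) for v in values).encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % Q


def prove(x, i):
    """Return (g^x, g^v, r): a proof that participant i knows x."""
    gx = pow(G, x, P)
    v = secrets.randbelow(Q)          # fresh random value for each proof
    gv = pow(G, v, P)
    h = challenge(G, gv, gx, i)
    r = (v - x * h) % Q
    return gx, gv, r


def verify(gx, gv, r, i):
    """Accept if g^v equals g^r * (g^x)^h."""
    h = challenge(G, gv, gx, i)
    return gv == (pow(G, r, P) * pow(gx, h, P)) % P


if __name__ == "__main__":
    gx, gv, r = prove(x=123, i=1)
    print(verify(gx, gv, r, 1))
```

The check works because the exponents are reduced modulo the group order, so r + x h is congruent to v.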

The operation of the anonymous veto protocol consists of two consecutive rounds (G is a publicly<br />

agreed group with order q and generator g):<br />

1. First, every participant i selects a secret random value $x_i \in \mathbb{Z}_q$. Then $g^{x_i}$ is broadcast with<br />

a knowledge proof. The knowledge proof is needed to ensure that the participant knows $x_i$<br />

without revealing the value of $x_i$. Without the knowledge proof, the node could choose $g^{x_i}$ in<br />

a way to influence the result of the protocol (it is widely believed that for a given $g^{x_i} \pmod p$<br />

it is hard to find $x_i \pmod p$; this problem is known as the discrete logarithm problem). Then<br />

every participant checks the knowledge proofs, and computes a special product of the received<br />

values:<br />

$$g^{y_i} = \prod_{j=1}^{i-1} g^{x_j} \Big/ \prod_{j=i+1}^{N} g^{x_j}$$

2. $g^{y_i c_i}$ is broadcast with a knowledge proof (the knowledge proof is needed to ensure that the<br />

node cannot influence the election maliciously afterwards). $c_i$ is set to $x_i$ for non aggregators,<br />

while it is set to a random value $r_i$ for aggregators.<br />

The product $P = \prod_{i=1}^{N} g^{c_i y_i}$ equals 1 if and only if no cluster aggregator is elected (none vetoed<br />

the question: Is the number of cluster aggregators elected zero?). If no aggregator is elected, then<br />

it will be clear for all participants, and the election can be done again. If P differs from 1, then<br />

some nodes have announced themselves to be cluster aggregators, and this is known by all the nodes.<br />
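The two rounds and the final product check can be illustrated with a small sketch. It omits the knowledge proofs and the broadcast itself, and it assumes a toy group (safe prime 2039, subgroup of prime order 1019 generated by 4) that is far too small for real use. The function name and the list-based interface are hypothetical.

```python
P_MOD = 2039  # toy safe prime (illustration only, not secure)
Q = 1019      # prime order of the subgroup generated by G
G = 4


def veto_product(xs, cs):
    """Return the product of g^(c_i * y_i) over all participants.

    xs[i] is the round-one secret of participant i; cs[i] equals xs[i] for a
    non aggregator and is a fresh random value for an aggregator.
    """
    n = len(xs)
    gx = [pow(G, x, P_MOD) for x in xs]            # round one broadcasts
    product = 1
    for i in range(n):
        numerator = 1
        for j in range(i):                          # participants before i
            numerator = numerator * gx[j] % P_MOD
        denominator = 1
        for j in range(i + 1, n):                   # participants after i
            denominator = denominator * gx[j] % P_MOD
        gy = numerator * pow(denominator, -1, P_MOD) % P_MOD
        product = product * pow(gy, cs[i], P_MOD) % P_MOD  # round two
    return product


if __name__ == "__main__":
    xs = [3, 5, 7]
    print(veto_product(xs, xs))          # nobody elected itself
    print(veto_product(xs, [11, 5, 7]))  # participant 0 elected itself
```

With honest non aggregators the exponents cancel, because the sum of all x_i y_i is zero, so the product is 1. In a group of realistic size, a random c_i makes the product differ from 1 except with negligible probability.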



If we consider the effect of the second step (new election is run if no aggregator is elected), the<br />

expected number of aggregators is slightly higher than in the case of the binomial distribution. The<br />

expected number of aggregators is:<br />

$$c_E = \frac{Np}{1 - (1 - p)^{N}}$$
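As a worked illustration, the expected number of aggregators can be evaluated for given N and p, and the relation can be inverted numerically to obtain the p that yields a desired average. The bisection below is an assumption made for illustration, not a method stated in the thesis; it relies on the expression being non-decreasing in p, and both function names are hypothetical.

```python
def expected_aggregators(n, p):
    """c_E = N p / (1 - (1 - p)^N), with re-election when nobody is elected."""
    return n * p / (1 - (1 - p) ** n)


def election_probability(n, target, tol=1e-12):
    """Find p with expected_aggregators(n, p) close to target (1 < target <= n)."""
    if not 1 < target <= n:
        raise ValueError("target must satisfy 1 < target <= n")
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if expected_aggregators(n, mid) < target:
            lo = mid
        else:
            hi = mid
    return hi


if __name__ == "__main__":
    print(expected_aggregators(20, 0.1))     # slightly above N p = 2
    p = election_probability(20, 2.0)
    print(p, expected_aggregators(20, p))    # p is slightly below 2 / 20
```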

The anonymity of the election subprotocol depends on the anonymity of its two steps. Obviously,<br />

the random number generation does not leak any information about the identity of the aggregator<br />

nodes, if the random number generator is secure. A cryptographically secure random number<br />

generator, called TinyRNG, is proposed in [Francillon and Castelluccia, 2007] for wireless sensor<br />

networks. Using a secure random number generator, it is unpredictable who elects itself to be<br />

aggregator node.<br />

The anonymity analysis of the anonymous veto protocol can be found in [Hao and Zielinski, 2006].<br />

The anonymity is based on the decisional Diffie-Hellman assumption, that is, on the decisional<br />

Diffie-Hellman problem being hard.<br />

The message complexity of the election is $O(N^2)$, which is acceptable as the election is run<br />

infrequently (N is the number of nodes in the cluster).<br />

If this overhead with the 4 modular exponentiations (see Table 4.3 for the complexities and<br />

Table 4.1 for the estimated running times, note that RSA is based on modular exponentiation)<br />

is too big for the application, then it can use the basic protocol described in Section 4.3.1, where<br />

only symmetric key encryption is used.<br />

In wireless sensor networks, the links are in general not reliable: packet losses occur from time to<br />

time. Reliability can be introduced by the link layer or by the application. As it is crucial to run<br />

the election protocol without any packet loss, it is required to use a reliable link layer protocol for<br />

this subprotocol. Such protocols are suggested in [Iqbal and Khayam, 2009; Wan et al., 2002] for<br />

wireless sensor networks.<br />

In summary, after the election subprotocol every node is an aggregator with equal probability. The<br />

election subprotocol ensures that at least one aggregator is elected and that the elected node(s) are aware of<br />

their status. An outside attacker does not know the identity of the aggregators or even the actual<br />

number of the elected aggregator nodes. An attacker, who compromised one or more nodes, can<br />

decide whether the compromised nodes are aggregators, but cannot be certain about the other<br />

nodes.<br />

4.4.3 Data aggregation<br />

The main goal of the WSN is to measure some data from the environment, and store the data<br />

for later use. This section describes how the data is forwarded to the aggregator(s) without the<br />

explicit knowledge of the identifier(s) of the aggregator(s).<br />

The data aggregation and storage procedure uses the broadcast channel. If the covered area<br />

is so small or the radio range is so large that every node can reach each other directly, then the<br />

aggregation can be implemented simply. Every node broadcasts its measurement to the common<br />

channel, and the cluster aggregator(s) can aggregate and store the measurements. If the covered<br />

area is bigger (which is the more realistic case), a connected dominating set based solution is<br />

proposed.<br />

In each timeslot, each ordinary node (not member of the CDS) sends its measurement to one<br />

neighboring CDS member (to the parent) by unicast communication. When the epoch has elapsed<br />

and all the measurements from the nodes are received, the CDS nodes aggregate the measurements<br />

and use a modification of the Echo algorithm on the given spanning tree to compute the gross<br />

aggregated measurement in the following way: each CDS member waits until all but one CDS<br />

neighbor sends its subaggregate to it, and after some random delay it sends the aggregate to the<br />

remaining neighbor. This means that the leaf nodes of the tree start the communication, and then<br />

the communication wave is propagated towards the root of the spanning tree. This behavior is the<br />

same as the second phase of the Echo algorithm. When one node receives the subaggregates from<br />

all of its neighbors, and thus cannot send its own to anyone, it can compute the gross aggregated value of<br />

the network. Then, this value is distributed among the cluster members by having every CDS member<br />

broadcast it.<br />

Figure 4.7: Aggregation example (labels in the figure: Aggregator, CDS). The subfigures from left to right represent the consecutive steps<br />

of an average computation: (i) The measured data is ready to send. It is stored in the format<br />

actual average; number of data. Non CDS nodes send the average to their parents. (ii) The CDS<br />

nodes start to send the aggregated value to their parents. (iii) A CDS node receives an aggregate<br />

from all of its neighbors, and starts to broadcast the final aggregated value. Nodes willing to store<br />

the value can do so. (iv) Other CDS nodes receiving the final value rebroadcast it. Nodes willing<br />

to store the value can do so.<br />

This second phase is needed, so that every member of the cluster can be aware of the gross<br />

aggregated value, and the anonymous aggregators can store it, while the others can simply discard<br />

it. The stored data includes the value itself, the timeslot in which the aggregate was computed, and the name of the<br />

environmental variable if more than one variable (e.g. temperature and humidity) is<br />

recorded.<br />

The aggregation function can be any statistical function of the measured data. Some easily implementable<br />

and widely used functions are the minimum, maximum, sum or average. In Figure 4.7,<br />

the aggregation protocol is visualized with five nodes and two aggregators using the average as an<br />

aggregation function.<br />
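The wave towards the deciding node can be sketched as a sequential simulation. This is an illustration under several assumptions: the spanning tree over the CDS is given as a dictionary of neighbor sets, each CDS node already holds the subaggregate of its own children as an (average, count) pair as in Figure 4.7, and random delays, packet loss, and the final rebroadcast are left out. The function names are hypothetical.

```python
def merge(a, b):
    """Combine two (average, count) pairs into one."""
    (avg_a, n_a), (avg_b, n_b) = a, b
    n = n_a + n_b
    return ((avg_a * n_a + avg_b * n_b) / n, n)


def echo_aggregate(tree, local):
    """Return (deciding node, (gross average, count)) for a spanning tree.

    A node forwards its subaggregate once all but one neighbor has reported;
    a node that has heard from every neighbor computes the gross aggregate.
    """
    inbox = {v: {} for v in tree}    # subaggregates received, keyed by sender
    sent = set()
    while True:
        progressed = False
        for v in tree:
            if v in sent:
                continue
            missing = [u for u in tree[v] if u not in inbox[v]]
            if len(missing) > 1:
                continue             # still waiting for more neighbors
            aggregate = local[v]
            for sub in inbox[v].values():
                aggregate = merge(aggregate, sub)
            if not missing:
                return v, aggregate  # heard from everyone: gross aggregate
            inbox[missing[0]][v] = aggregate
            sent.add(v)
            progressed = True
        if not progressed:
            raise ValueError("input is not a tree")


if __name__ == "__main__":
    tree = {"A": {"B"}, "B": {"A", "C"}, "C": {"B"}}
    # A collected 1 and 3, B holds 2, C collected 3 and 4.
    local = {"A": (2.0, 2), "B": (2.0, 1), "C": (3.5, 2)}
    print(echo_aggregate(tree, local))
```

The same structure works for minimum, maximum, or sum by replacing `merge`; the (average, count) pair is needed only because an average cannot be merged from averages alone.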

The anonymity analysis of the aggregation subprotocol is quite simple. After the aggregation,<br />

every node possesses the same information as an external attacker can get. This information is<br />

the aggregated data itself, without knowing anything about the identity of the aggregators. If the<br />

operator wants to hide the aggregated data, it can use some techniques discussed in Section 4.5.<br />

The message complexity of the aggregation is O(N), where N is the number of nodes in the<br />

cluster. This is the best complexity achievable, because to store all the measurements by a single<br />

aggregator, all nodes must send the measurements towards the aggregator, which leads to O(N)<br />

message complexity. In terms of latency, the advanced protocol doubles the time until the aggregated<br />

measurement arrives at the aggregator compared to a naive system, where the identity of the<br />

aggregators is known to every participant. This latency is acceptable as in most WSN applications<br />

the time between the measurements is much longer than the time required to aggregate the data.<br />

As mentioned in the election subprotocol, the protocol must be prepared for packet losses due<br />

to the nature of wireless sensor networks. In the aggregation subprotocol two kinds of packet loss<br />

can be envisioned: a packet can be lost before or after the final aggregate is computed. Both<br />

cases can be detected by timers and a resend request can be sent. If the resend is repeatedly<br />

unsuccessful, the aggregation must be run without those messages. If the lost message contains a<br />

measurement or subaggregate, then the final aggregate will be computed without that data leading<br />

to an inaccurate measurement. If the lost message contained the gross aggregate, then some nodes<br />

will not receive the gross aggregate. Here it is very useful that the network can have multiple<br />

aggregators, because if at least one aggregator receives the data, the data can be queried by the<br />

operator.<br />



4.4.4 Query<br />


The ultimate goal of the sensor network is to make the measured data available to the operator<br />

upon request. While the aggregation subprotocol ensures that the measured data is stored by the<br />

aggregators, the goal of the query subprotocol is to provide the requested data to the operator and<br />

keep the aggregators’ identity hidden at the same time.<br />

One solution would be that the operator visits all the nodes, and connects to them by wire.<br />

While this solution would leak no information about the identities of the aggregators to any eavesdropping<br />

attacker, the execution would be very time consuming and cumbersome. Moreover, the<br />

accessibility of some nodes may be difficult or dangerous (for example in a military scenario).<br />

Therefore, I propose a solution where it is sufficient for the operator to get in wireless communication<br />

range of any of the nodes. This node does not need to be an aggregator, as actually no one,<br />

not even the operator knows who the aggregator nodes are.<br />

As a first step, the operator authenticates itself to the selected node O using the key kO. After<br />

that, node O starts the query protocol by sending out a query, obtains the response to the query<br />

from the cluster, and makes the response available to the operator. In the following, it is assumed<br />

that O is not a CDS node. (If it is indeed a CDS node, then the first and last transmission of the<br />

query protocol can be omitted.)<br />

Node O broadcasts the query data Q with the help of the CDS nodes in the cluster. This<br />

is done by sending Q to the CDS parent, and then every CDS member rebroadcasts Q as it is<br />

received. The query Q describes what information the operator is interested in. It includes a<br />

variable name, a time interval, and a field for collecting the response to the query. It also includes<br />

a bit, called “aggregated”, which will later be used in the detection of misbehaving nodes. For the<br />

details of misbehaving node detection, the reader is referred to Section 4.4.5; here we assume that<br />

the “aggregated” bit is always set meaning that aggregation is enabled.<br />

The idea of the query protocol is that each node i in the cluster contributes to the response by<br />

a number Ri, which is computed as follows:<br />

\[
R_i =
\begin{cases}
h(Q|k_i), & \text{for non-aggregators} \\
h(Q|k_i) + M, & \text{for aggregators}
\end{cases}
\tag{4.4}
\]

where M is the stored measurement (available only if the node is an aggregator), h is a cryptographic<br />

hash function, and ki is the key shared by node i and the operator. Thus, non-aggregators<br />

contribute with a pseudo-random number h(Q|ki) computed from the query and the key ki, which<br />

can later be also computed by the operator, while aggregator nodes contribute with the sum of<br />

a pseudo-random number and the requested measurement data. The sum is ordinary fixed-point<br />

addition, which can overflow if the hash is a large value.<br />
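The contribution rule of Equation (4.4) can be illustrated with a minimal sketch. The choice of SHA-256 truncated to one 16-bit word, the constant names, and the function names are assumptions made only for this illustration; the text above fixes only that h is a cryptographic hash and that the addition is fixed-point and may wrap around.

```python
import hashlib

WORD_BITS = 16             # assumed word size of the sensor's micro-controller
MODULUS = 1 << WORD_BITS   # fixed-point addition wraps around at 2^16


def h(query: bytes, key: bytes) -> int:
    """h(Q|k_i): hash of the query concatenated with the node key,
    truncated to one word (SHA-256 is an assumption of this sketch)."""
    digest = hashlib.sha256(query + key).digest()
    return int.from_bytes(digest[:WORD_BITS // 8], "big")


def contribution(query: bytes, key: bytes, measurement=None) -> int:
    """R_i as in Equation (4.4); measurement is None for non-aggregators."""
    r = h(query, key)
    if measurement is not None:
        # Aggregators add the stored measurement M; overflow simply wraps.
        r = (r + measurement) % MODULUS
    return r
```

Because the mask h(Q|k_i) looks pseudo-random to anyone who does not know k_i, an eavesdropper cannot tell the two cases apart, while the operator can remove the mask again by modular subtraction.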

The goal is that the querying node O receives back the sum of all these Ri values. For this<br />

reason, when the query Q is received by a non CDS node from its CDS parent, it computes its<br />

Ri value and sends it back to the CDS parent in the response field of the query token. When a<br />

CDS parent receives back the query tokens with the updated response field from its children, it<br />

computes the sum of the received Ri values and its own, and after inserting the identifiers of the<br />

nodes sends the result back to its parent. This is repeated until the query token reaches back to<br />

the CDS parent of node O, which can forward the response R = ∑ Ri and the list of responding<br />

nodes to node O, where the sum is computed by ordinary fixed-point addition. This operation is<br />

illustrated in Figure 4.8.<br />
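The bottom-up summation along the CDS can be sketched as a recursive walk over a tree. The dictionary-based tree, the node names, and the example contribution values are hypothetical; the sketch only assumes that every responding node adds its own R_i to the partial sums of its children and appends the responders' identifiers.

```python
MODULUS = 1 << 16  # assumed 16-bit fixed-point words


def collect(node, children, contributions):
    """Return (partial sum, responder ids) for the subtree rooted at node."""
    total = contributions[node]
    responders = [node]
    for child in children.get(node, []):
        child_sum, child_ids = collect(child, children, contributions)
        total = (total + child_sum) % MODULUS   # wrap-around addition
        responders.extend(child_ids)
    return total, responders


# Hypothetical cluster: "p" is the CDS parent of node O, "c1" is another CDS node.
children = {"p": ["n1", "c1"], "c1": ["n2", "n3"]}
contributions = {"p": 60000, "n1": 5000, "c1": 12, "n2": 40000, "n3": 7}

# R and the list of responding nodes, as forwarded to node O.
R, ids = collect("p", children, contributions)
```

Each node transmits one response message in this scheme, which is consistent with the O(N) message complexity stated later in this section.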

Figure 4.8: Query example. The subfigures from left to right represent the consecutive steps of a<br />

query: (i) The operator sends the query Q to node O. This node forwards it to its CDS parent.<br />

The CDS parent broadcasts the query. (ii) The CDS nodes broadcast the query, so every node in<br />

the network is aware of Q. (iii) Every non-CDS node (except O) sends its response to its parent. (iv)<br />

The sum of the responses is propagated back to the parent of O (including the list of responding<br />

nodes, not shown in the figure), which forwards it to the operator through O.<br />

When receiving R from O, the operator can calculate the stored data as follows. First of all, the<br />

operator can regenerate each hash value h(Q|ki), because it stores (or can compute from a master<br />

key on-the-fly) each key ki, and it knows the original query data Q. The operator can subtract the<br />

hash values from R (note that the responding nodes list is present in the response), and it gets a<br />

result R′ = cM, where c is the actual number of aggregators in the cluster (each aggregator<br />

contributed the measurement M to the response, which is why the result is c times M).<br />

Unfortunately, this number c is unknown to the operator, as it is unknown to everybody else.<br />

Nevertheless, if M is restricted to lie in an interval [A, B] such that the intervals [iA, iB] for<br />

i = 1, 2, . . . , N are non-overlapping, then cM can fall only into the interval [cA, cB], and hence c<br />

can be uniquely determined by the operator by checking which interval R′ belongs to. Then,<br />

dividing R′ by c gives the requested data M.<br />
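The decoding step performed by the operator can be sketched as follows. SHA-256 truncated to 16 bits, the key format, and the example cluster are assumptions of this sketch; the interval [2000, 2100] and N = 20 are borrowed from the numerical example given later in this section.

```python
import hashlib

MODULUS = 1 << 16  # assumed 16-bit fixed-point words (L = 2^16)


def h(query: bytes, key: bytes) -> int:
    """h(Q|k_i) truncated to 16 bits; SHA-256 is an assumption of this sketch."""
    return int.from_bytes(hashlib.sha256(query + key).digest()[:2], "big")


def decode(R: int, query: bytes, responder_keys, A: int, B: int, N: int):
    """Return (c, M) recovered from R, or None if R' lies in no interval [iA, iB]."""
    r_prime = R
    for key in responder_keys:
        # Strip the mask h(Q|k_i) of every node on the responders list.
        r_prime = (r_prime - h(query, key)) % MODULUS
    for c in range(1, N + 1):
        if c * A <= r_prime <= c * B:
            return c, r_prime // c
    return None


# Hypothetical cluster: five responders, two of them aggregators storing M = 2050.
query = b"temperature|t0..t1"
keys = [b"key-%d" % i for i in range(5)]
R = (sum(h(query, k) for k in keys) + 2 * 2050) % MODULUS
result = decode(R, query, keys, A=2000, B=2100, N=20)
```

Since the hashes are removed modulo 2^16 and cM never exceeds NB < L, the wrap-around of the individual additions does not affect the recovered value.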

More specifically, and for practical reasons, the following three criteria need to be satisfied by<br />

the interval [A, B] for my query scheme to work: (i) as we have seen before, for unique decoding<br />

of cM, the intervals [iA, iB] for i = 1, 2, . . . , N must be non-overlapping, (ii) in order to fit in<br />

the messages and to avoid integer overflow 3 , the highest possible value for cM, i.e., NB must be<br />

representable with a pre-specified number L, and (iii) it must be possible to map a pre-specified<br />

number D of different values into [A, B].<br />

The first criterion (i) is met if the lower end of each interval is larger than the higher end of<br />

the preceding interval:<br />

\[ 0 < iA - (i - 1)B = i(A - B) + B, \qquad i = 1, \dots, N \]

Note that if the above inequality holds for i = N, then it holds for every i, because A − B is a<br />

negative constant and B is a positive constant. So it is enough to consider only the case of i = N:<br />

\[ 0 < N(A - B) + B \quad\Longleftrightarrow\quad B < \frac{N}{N-1}\,A \tag{4.5} \]

The second criterion (ii) means that<br />

\[ BN < L \quad\Longleftrightarrow\quad B < \frac{L}{N} \tag{4.6} \]

while the third criterion (iii) can be formalized as<br />

\[ D < B - A \quad\Longleftrightarrow\quad B > A + D \tag{4.7} \]

Figure 4.9 shows an example for a graphical representation of the three criteria, where the<br />

crossed area represents the admissible (A, B) pairs. It can also be easily seen in this figure that<br />

a solution exists only if the B coordinate of the intersection of inequalities (4.5) and (4.7) meets<br />

criterion (4.6). The boundary lines B = N A/(N − 1) and B = A + D intersect at A = (N − 1)D,<br />

B = ND, so in other words<br />

\[ ND < \frac{L}{N} \]

3 In case of overflow, the result is not unique.<br />

Figure 4.9: Graphical representation of the suitable intervals<br />

As a numerical example, let us assume that we want to measure at least 100 different values<br />

(D = 99), the micro-controller is a 16-bit controller (L = 2^16), and we have at most 20 nodes in<br />

each cluster (N = 20). Then a suitable interval that satisfies all three criteria would be [A, B] =<br />

[2000, 2100]. Checking that this interval indeed meets the requirements is left for the interested<br />

reader. Finally, note that any real measurement interval can be easily mapped to this interval<br />

[A, B] by simple scaling and shifting operations, and my solution requires that such a mapping is<br />

performed on the real values before the execution of the query protocol.<br />
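For readers who prefer to check the numerical example mechanically, the three criteria can be evaluated with a short sketch. The function names are hypothetical, and exact rational arithmetic is used only to avoid rounding questions at the interval boundaries.

```python
from fractions import Fraction


def satisfies_criteria(A: int, B: int, N: int, L: int, D: int) -> bool:
    """Check criteria (4.5)-(4.7) for a candidate interval [A, B]."""
    non_overlapping = B < Fraction(N, N - 1) * A   # (4.5)
    representable = B * N < L                      # (4.6)
    enough_values = D < B - A                      # (4.7)
    return non_overlapping and representable and enough_values


def solution_exists(N: int, L: int, D: int) -> bool:
    """Existence condition: the intersection point B = N*D must satisfy (4.6)."""
    return N * D < Fraction(L, N)


# Parameters of the numerical example in the text.
example_ok = satisfies_criteria(A=2000, B=2100, N=20, L=2**16, D=99)
```

With these parameters the upper bound from (4.5) is 20/19 · 2000 ≈ 2105.3 and the bound from (4.6) is 65536/20 = 3276.8, so B = 2100 is admissible, and B − A = 100 exceeds D = 99.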

My proposed protocol has several advantageous properties. First, the network can respond to<br />

a query if at least one aggregator can successfully participate in the subprotocol. Second, the<br />

operator does not need to know the identity of the aggregators, thus even the operator cannot<br />

leak that information accidentally (although, after receiving the response, the operator learns the<br />

actual number of the aggregator nodes). Third, the protocol does not leak any information about<br />

the identity of the aggregators: an attacker can eavesdrop on the query information Q and the<br />

pseudo-random numbers Ri, but cannot deduce from them the identity of the aggregators. Finally,<br />

the message complexity of the query is O(N), where N is the number of nodes in the cluster. This<br />

is the best complexity achievable, when the originator of the query does not know the identity of<br />

the aggregator(s). The latency of the query protocol depends on the longest path of the network<br />

rooted at node O.<br />

As mentioned for the previous subprotocols, the protocol must be prepared for packet losses<br />

due to the nature of wireless sensor networks. Because of packet losses, the final sum R covers<br />

only the responding nodes, which may be a proper subset of all nodes. That is why the identifiers must be<br />

included in the responses. The operator can calculate cM regardless of the actual subset of<br />

responders. If at least one response from an aggregator reaches the operator, it can calculate M in<br />

the previously described way. If cM = 0, then it is clear to the operator that every aggregator’s<br />

response was lost.<br />

4.4.5 Misbehaving nodes<br />

In this section, I look beyond my initial goal. I briefly analyze what happens if a compromised<br />

node deviates from the protocol to achieve some goals other than just learning the identity of the<br />

aggregators.<br />

In the election process, a compromised node may elect itself to be aggregator in every election.<br />

This can be a problem if this node is the only elected aggregator, because a compromised node may<br />


not store the aggregated values. Unfortunately this situation cannot be avoided in any election<br />

protocol, because an aggregator can be compromised after the election, and the attacker can erase<br />

the memory of that node. Actually my protocol is partially resistant to this attack, because more<br />

than one aggregator may be elected with some probability, and the attacker cannot be sure if the<br />

compromised node is the single aggregator node in the cluster.<br />

During the aggregation, a misbehaving node can modify its readings, or modify the values it<br />

aggregates. The modification of others’ values can be prevented by some broadcast authentication<br />

schemes discussed in Section 4.4.1. The problem of reporting false values can be handled by<br />

statistical approaches discussed in [Buttyán et al., 2006; Wagner, 2004; Buttyán et al., 2009].<br />

The most interesting subprotocol from the perspective of misbehaving nodes is the query protocol.<br />

In this protocol, a compromised node can easily modify the result of the query in the following<br />

way. A compromised node can add an arbitrary number X to the hash in Equation (4.4) instead of<br />

using 0 or M. It is easy to see that if X is selected from the interval [A, B], then after subtracting<br />

the hashes, the resulting sum R′ will be an integer in the interval [(c+1)A, (c+1)B] (c is the actual<br />

number of aggregators; c + 1 nodes act like aggregators: the c aggregators and the compromised<br />

node). A compromised node can further increase its influence by choosing X from the interval<br />

[iA, iB]. This means that the resulting sum R′ will be in the interval [(c + i)A, (c + i)B]. If X is<br />

not selected from an interval [jA, jB], j = 1 . . . N, then the result can be outside of the decodable<br />

intervals. This can be immediately detected by the operator (see Figure 4.10).<br />

If the result is in a legitimate interval (∃j, R ′ ∈ [jA, jB]), then the operator can further check the<br />

consistency by calculating R′ mod j. If the result is zero, then it is possible that no misbehaving<br />

node is present in the network. If the result is nonzero, the operator can be sure that, apart<br />

from the zeros and Ms, some node sent a different value, thus a misbehaving node is present in<br />

the network. It is hard for the attacker to guess j, because it neither knows the actual number of<br />

aggregators, nor can calculate R ′ from R by subtracting the unknown hashes.<br />

If the modulus is zero, but the operator is still suspicious about the result, it can further test<br />

the cluster for misbehaving nodes with the help of the aggregated bit in the queries. This further<br />

testing can be done regularly, randomly, or on receiving suspicious results. If the aggregated bit is<br />

cleared in a query Q, then the CDS nodes do not sum the incoming replies, but forward them<br />

towards the agent node O as they are received. So if the operator wants to check whether a misbehaving<br />

node is present in the network, it can run a query Q with the aggregated bit set, and then run the<br />

same query with the aggregated bit cleared. If the two results are different, then the operator can be<br />

sure that a node wants to hide its malicious activity from the operator. If the two sums are equal,<br />

then the operator can further check the results from the second round. If the values are all equal<br />

after subtracting the hashes (not considering the zero values), then no misbehavior is detected,<br />

otherwise some node(s) misbehave in the cluster.<br />

Note that this algorithm does not find every misbehavior, but the misbehaviors not<br />

detected by this algorithm do not influence the operator. For example, two nodes can misbehave<br />

such that the first adds S to its hash and the second adds −S. It is clear that this misbehavior<br />

does not affect the result computed by the operator, because S − S = 0. Another misbehavior not<br />

detected by the algorithm occurs if a compromised non-aggregator node sends M instead of 0. This is<br />

not detected by the algorithm, but it does not modify the result the operator computes. The operation<br />

of the misbehavior detection algorithm is depicted in Figure 4.10. This algorithm only detects whether some<br />

misbehavior has occurred in the cluster, but does not necessarily find the misbehaving node. I leave<br />

the elaboration of this problem for future work.<br />
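The two-stage check described in this section can be sketched as follows. The sketch operates on values from which the hashes have already been subtracted (R′ for the aggregated query and the individual R′_i for the query with the aggregated bit cleared); the function names and the example values are hypothetical, and the interval [2000, 2100] with N = 20 is taken from the numerical example of Section 4.4.4.

```python
MODULUS = 1 << 16  # assumed 16-bit fixed-point words


def find_count(r_prime: int, A: int, B: int, N: int):
    """Return j with R' in [jA, jB], or None if R' is outside every interval."""
    for j in range(1, N + 1):
        if j * A <= r_prime <= j * B:
            return j
    return None


def aggregated_check(r_prime: int, A: int, B: int, N: int) -> bool:
    """First stage: True means misbehavior is detected from R' alone."""
    j = find_count(r_prime, A, B, N)
    return j is None or r_prime % j != 0


def individual_check(r_prime: int, unmasked) -> bool:
    """Second stage on the individual R'_i values: True means misbehavior."""
    if sum(unmasked) % MODULUS != r_prime:
        return True                      # the two query rounds disagree
    nonzero = {v for v in unmasked if v != 0}
    return len(nonzero) > 1              # values other than 0 and a single M


# Two honest aggregators with M = 2050 pass both stages.
honest = aggregated_check(4100, 2000, 2100, 20) or individual_check(4100, [0, 2050, 2050, 0])
```

A forged pair such as 2000 and 2100 passes the first stage, because their sum 4100 is divisible by 2, and is caught only by the second stage; this is why the text suggests running the non-aggregated query regularly, randomly, or on suspicion.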

4.5 Related work<br />

A survey on privacy protection techniques for WSNs is provided in [Li et al., 2009], where they are<br />

classified into two main groups: data-oriented and context oriented protection. In this section, I<br />

briefly review these techniques, with an emphasis on those solutions that are closely related to my<br />

work.<br />

In data-oriented protection, the confidentiality of the measured data must be preserved. Another<br />

research direction is how the operator can verify that the received data is correct. The main<br />

focus is on confidentiality in [He et al., 2007], while the verification of the received data is also<br />

ensured in [Sheng and Li, 2008].<br />

[Flowchart; legible node labels: R′ = R − ∑_{i=1}^{N} h(Q|ki); ∃j: R′ ∈ [jA, jB]; R′ mod j = 0; R′i = Ri − h(Q|ki); ∑_{i=1}^{N} R′i = R′; ∃M: R′i = 0 ∨ R′i = M]<br />

Figure 4.10: Misbehavior detection algorithm for the query protocol.<br />

According to [Li et al., 2009] context oriented protection covers the location privacy of the<br />

source and the base station. The source location privacy is mainly a problem in event driven<br />

networks, where the existence and location of the event is the information, which must be hidden.<br />

The location privacy of the base station is discussed in [Deng et al., 2006b]. The main difference<br />

between hiding the base station and the in-network aggregators is that a WSN typically contains<br />

only one base station, which is a predefined node, while there are several in-network<br />

aggregators used in one network, and the nodes used as aggregators are periodically changed.<br />

The problem of private cluster aggregator election in wireless sensor networks is strongly related<br />

to anonymous routing in WSNs. The main difference between anonymous routing and anonymous<br />

aggregation is that anonymous routing supports any traffic pattern and generally handles external attackers,<br />

while anonymous aggregation supports aggregation specific traffic patterns and can handle<br />

compromised nodes as well. In [Seys and Preneel, 2006] an efficient anonymous on demand routing<br />

scheme called ARM is proposed for mobile ad hoc networks. For the same problem another solution<br />

is given in [Zhang et al., 2006] (MASK), where a detailed simulation is also presented for the<br />

proposed protocol. A more efficient solution is given in [Rajendran and Sreenaath, 2008], which<br />

uses low cryptographic overhead, and addresses some drawbacks of the two papers above. In [Choi<br />

et al., 2007] a privacy preserving communication system (PPCS) is proposed. PPCS provides a<br />

comprehensive solution to anonymize communication endpoints, keep the location and identifier<br />

of a node unlinkable, and mask the existence of communication flows.<br />

The security of different aggregator node election protocols is surveyed in [Schaffer et al., 2012].<br />

Most protocols either provide no security for the election or aim at the non-manipulability<br />

of the election. Such protocols can withstand passive attacks [Kuhn et al., 2006], or active<br />

attacks as well [Sirivianos et al., 2007; Gicheol, 2010].<br />

4.6 Conclusion<br />

In wireless sensor networks, in-network data aggregation is often used to ensure scalability and<br />

energy efficient operation. However, as we saw, this also introduces some security issues: the<br />

designated aggregator nodes that collect and store aggregated sensor readings and communicate<br />

with the base station are attractive targets of physical node destruction and jamming attacks. In<br />

order to mitigate this problem, in this chapter, I proposed two private aggregator node election<br />

protocols for wireless sensor networks that hide the elected aggregator nodes from the attacker,<br />

who, therefore, cannot locate and disable them. My basic protocol provides fewer guarantees than<br />

my advanced protocol, but it may be sufficient in cases where the risk of physical compromise of<br />

nodes is low. My advanced protocol hides the identity of the elected aggregator nodes even from<br />

insider attackers, thus it handles node compromise attacks too.<br />

I also proposed a private data aggregation protocol and a corresponding private query protocol<br />

for the advanced version, which allow the aggregator nodes to collect sensor readings and respond to<br />

queries of the operator, respectively, without revealing any useful information about their identity.<br />

My aggregation and query protocols are resistant to both external eavesdroppers and compromised<br />

nodes participating in the protocol. The communication in the advanced protocol is based on the<br />

concept of a connected dominating set, which is well suited to wireless sensor networks.<br />

In this chapter I went beyond the goal of only hiding the identity of the aggregator nodes. I<br />

also analyzed what happens if a malicious node wants to exploit the anonymity offered by the<br />

system, and tries to mislead the operator by injecting false reports. I proposed an algorithm that<br />

can detect if any of the nodes misbehaves in the query phase. I only detect the fact of misbehavior<br />

and leave the identification of the misbehaving node itself for future work.<br />

In general, my protocols increase the dependability of sensor networks, and therefore, they can<br />

be applied in mission critical sensor network applications, including high-confidence cyber-physical<br />


systems where sensors and actuators monitor and control the operation of some critical physical<br />

infrastructure.<br />

4.7 Related publications<br />

[Buttyán and Holczer, 2009] Levente Buttyán and Tamas Holczer. Private cluster head election<br />

in wireless sensor networks. In Proceedings of the Fifth IEEE International Workshop on Wireless<br />

and Sensor Networks Security (WSNS 2009), pages 1048–1053. IEEE, 2009.<br />

[Buttyán and Holczer, 2010] Levente Buttyán and Tamas Holczer. Perfectly anonymous data<br />

aggregation in wireless sensor networks. In Proceedings of The 7th IEEE International Conference<br />

on Mobile Ad-hoc and Sensor Systems (WSNS 2010), San Francisco, November 2010. IEEE.<br />

[Holczer and Buttyán, 2011] Tamas Holczer and Levente Buttyán. Anonymous aggregator<br />

election and data aggregation in wireless sensor networks. International Journal of Distributed<br />

Sensor Networks, page 18, 2011. Article ID 828414.<br />

[Schaffer et al., 2012] Péter Schaffer, Károly Farkas, Ádám Horváth, Tamás Holczer, and Levente<br />

Buttyán. Secure and reliable clustering in wireless sensor networks: A critical survey. Elsevier<br />

Computer Networks, 2012.<br />



Chapter 5<br />

Application of new results<br />

In this dissertation three different wireless network based systems are considered: Radio Frequency<br />

Identification Systems, Vehicular Ad Hoc Networks, and Wireless Sensor Networks. In this chapter,<br />

a brief overview is given of where these systems are used and how my new results fit into them.<br />

Radio Frequency Identification Systems RFID is applied very widely; some<br />

application areas are [Wu et al., 2009; RFID, 2012]:<br />

Payment by mobile phones Many companies like MasterCard or Nokia are working on mobile<br />

phones with embedded RFID capabilities to enable payment by such devices.<br />

Inventory systems RFID systems can provide accurate knowledge of the current inventory,<br />

which helps saving labor cost, and enables self checkout in shops.<br />

Access control RFID tags can be used as identification badges to enable access control in office<br />

buildings, or can be used as tickets in automated fare collection systems.<br />

Transportation and logistics In transportation, RFID tags can help identify cargo, its owner<br />

or destination.<br />

Passport Many countries include RFID tags in passports to speed up passport control at the<br />

borders and to make illegitimate replication harder.<br />

Hospitals and healthcare Hospitals began implanting patients with RFID tags and using RFID<br />

systems, usually for workflow and inventory management [Fisher, 2006].<br />

Libraries Libraries are using RFID to replace the barcodes on library items. An RFID system<br />

may replace or supplement bar codes and may offer another method of inventory management<br />

and self-service checkout by patrons. [Molnar and Wagner, 2004]<br />

Any usage of RFID systems where the holder of the tag is a human being might breach<br />

the privacy of the holder. The solutions proposed in Chapter 2 can be used in such situations.<br />

An example application is an automated fare collection system, where the pass for the mass<br />

transportation system can contain an RFID tag. In such a system, the system designer might<br />

consider the usage of key trees or group based private authentication, in particular if the legal<br />

environment requires the usage of some kind of privacy enhancing technology.<br />

Vehicular Ad Hoc Networks The applications of Vehicular Ad Hoc Networks are numerous,<br />

but they can be grouped into three main categories: safety related applications, transport efficiency,<br />

and information/entertainment applications [Hartenstein and Laberteaux, 2008; Willke<br />

et al., 2009]. Hundreds of possible applications can be envisioned or are under development. One<br />

such application is cooperative forward collision warning, which helps avoid rear-end collisions<br />

with the use of beacon messages. Traffic efficiency can be increased, for example, by a traffic<br />

light optimal speed advisory application, which can assist the driver in arriving during a green<br />

phase. An example of the information gathering applications is remote wireless<br />

diagnosis, which makes the state of the vehicle accessible for remote diagnosis.<br />

Most of the safety and traffic efficiency related applications are based on the beacon messages,<br />

which are frequent messages containing the location, heading, identifier, and some other attributes<br />

of the vehicle. These messages can enable the tracking of individual vehicles, which is an undesirable<br />

side effect of the usage of VANETs. This side effect is analyzed in Chapter 3, and a countermeasure<br />

is proposed as well. The countermeasure algorithm is compatible with the framework proposed by<br />

the Car 2 Car Communication Consortium [Consortium, 2012].<br />

Most of the results of Chapter 3 were parts of the results of the SeVeCom 1 European Commission<br />

funded project. The results were delivered to and accepted by the European Commission.<br />

Wireless Sensor Networks Wireless sensor networks can be used in many scenarios. In Chapter<br />

4 I proposed two anonymous aggregation schemes, which hide the identity of the aggregator node.<br />

In the following, a few applications are given based on [Akyildiz et al., 2002] with special attention<br />

to the possible need of hiding some special nodes: wireless sensor networks can be an integral part<br />

of military command, control, communications, computing, intelligence, surveillance, reconnaissance<br />

and targeting (C4ISRT) systems, where there is a clear motivation for an attacker to disturb<br />

the normal functioning of the network by eliminating some special nodes. Another example can<br />

be the protection of critical infrastructure. The problem is that some critical infrastructures like<br />

electrical lines or drinking water pipes are so large in scale that it is impossible to protect them<br />

with traditional methods. WSNs can be a possible protection and surveillance system, where the<br />

disturbance of normal operation by the elimination of aggregator nodes must be avoided.<br />

In the above mentioned applications, there is a clear need for aggregation, and the loss of<br />

the aggregator might have undesirable consequences. Hence in these applications, the anonymous<br />

aggregator election, aggregation, and query schemes proposed in Chapter 4 can be used.<br />

The goal of the Wireless Sensor and Actuator Networks for Critical Infrastructure Protection<br />

project (WSAN4CIP 2 ), funded by the European Commission, was to make critical infrastructure<br />

more dependable by the use of WSNs. Some of the results of Chapter 4 were an integral part of that<br />

project.<br />

In summary, it can be seen that the results of Chapters 2-4 can be used in real applications,<br />

and the problems discussed in the chapters are important for society.<br />

1 http://www.sevecom.org/<br />

2 http://www.wsan4cip.eu<br />



Chapter 6<br />

Conclusion<br />

In this thesis, I proposed several privacy enhancing protocols for wireless networks. I dealt with<br />

three different types of networks, namely RFID systems, vehicular ad hoc networks, and wireless<br />

sensor networks.<br />

In Chapter 2 I proposed a key-tree and a group based private authentication protocol for RFID<br />

systems. Both approaches use only symmetric key based cryptographic primitives, which suits<br />

resource-limited RFID systems well.<br />

Key-trees provide an efficient solution for private authentication, however, the level of privacy<br />

provided by key-tree based systems decreases considerably if some members are compromised.<br />

This loss of privacy can be minimized by the careful design of the tree. Based on my results<br />

presented in this dissertation, I can conclude that a good practical design principle is to maximize<br />

the branching factor at the first level of the tree such that the resulting tree still respects the<br />

constraint on the maximum authentication delay in the system. Once the branching factor at the<br />

first level is maximized, the tree can be further optimized by maximizing the branching factors<br />

at the successive levels, but the improvement achieved in this way is not really significant; what<br />

really counts is the branching factor at the first level.<br />

In the second part of Chapter 2, I proposed a novel group based private authentication scheme.<br />

I analyzed the proposed scheme and quantified the level of privacy that it provides. I compared<br />

my group based scheme to the key-tree based scheme. I showed that the group based scheme<br />

provides a higher level of privacy than the key-tree based scheme. In addition, the complexity of<br />

the group based scheme for the verifier can be set to be the same as in the key-tree based scheme,<br />

while the complexity for the prover is always smaller in the latter scheme. The primary application<br />

area of my schemes is RFID systems, but they can also be used in applications with similar<br />

characteristics (e.g., in wireless sensor networks).<br />

Possible future work includes the usage of different metrics like the entropy based<br />

metric, or the usage of different constraints like the minimal size of the anonymity sets when<br />

selecting a structure like the groups for the users. These new metrics or constraints can make the<br />

resulting optimization problem complex, which can require heuristic solutions as well. A general<br />

framework that could solve the optimization problem for different metrics and constraints could<br />

be a future research direction.<br />

The most criticized part of any key tree or group based solution is the difficulty of the key<br />

update. Hence, a challenging future work could be the implementation of a key update scheme in<br />

a tree based solution.<br />

In the first half of Chapter 3, I studied the effectiveness of changing pseudonyms to provide<br />

location privacy for vehicles in vehicular networks. The approach of changing pseudonyms to<br />

make location tracking more difficult was proposed in prior work, but its effectiveness has not<br />

been investigated yet. In order to address this problem, I defined a model based on the concept of<br />

the mix zone. I assumed that the adversary has some knowledge about the mix zone, and based<br />


on this knowledge, she tries to relate the vehicles that exit the mix zone to those that entered<br />

it earlier. I also introduced a metric to quantify the level of privacy enjoyed by the vehicles in<br />

this model. In addition, I performed extensive simulations to study the behavior of my model in<br />

realistic scenarios. In particular, in my simulation, I used a rather complex road map, generated<br />

traffic with realistic parameters, and varied the strength of the adversary by varying the number of<br />

her monitoring points. My simulation results provided detailed information about the relationship<br />

between the strength of the adversary and the level of privacy achieved by changing pseudonyms.<br />

I abstracted away the frequency with which the pseudonyms are changed, and simply assumed that this frequency is high enough for every vehicle to change its pseudonym while in the mix zone. Changing pseudonyms frequently has an advantage: frequent changes increase the probability that the pseudonym is changed in the mix zone. On the other hand, the higher the frequency, the larger the cost that the pseudonym changing mechanism induces on the system in terms of management of cryptographic material (keys and certificates related to the pseudonyms). In addition, if, for a given frequency, the probability of changing pseudonym in the mix zone is already close to 1, then there is no point in increasing the frequency further, as doing so no longer increases the level of privacy while it still increases the cost. Hence, there seems to be an optimal value for the frequency of the pseudonym change. Unfortunately, this optimal value depends on the characteristics of the mix zone, which are ultimately determined by the observing zone of the adversary, which is not known to the system designer.
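The saturation argument above can be illustrated with a minimal sketch. It is not the model used in the thesis: it assumes, purely for illustration, that a vehicle changes its pseudonym at a fixed period with a uniformly random phase and spends a fixed time in the mix zone. The function names, the dwell time, and the cost proxy (pseudonyms consumed per hour) are all hypothetical.

```python
def change_probability(period_s: float, dwell_s: float) -> float:
    """Probability that at least one pseudonym change falls inside the mix zone.

    Assumption (not from the thesis): changes happen every `period_s` seconds
    with a uniformly random phase, and the vehicle stays `dwell_s` seconds
    in the zone.
    """
    if period_s <= 0 or dwell_s < 0:
        raise ValueError("period_s must be positive and dwell_s non-negative")
    return min(1.0, dwell_s / period_s)


def pseudonyms_per_hour(period_s: float) -> float:
    """Hypothetical cost proxy: pseudonyms (keys, certificates) used per hour."""
    if period_s <= 0:
        raise ValueError("period_s must be positive")
    return 3600.0 / period_s


if __name__ == "__main__":
    dwell_s = 30.0  # illustrative time spent in the mix zone
    for period_s in (120.0, 60.0, 30.0, 15.0):
        p = change_probability(period_s, dwell_s)
        cost = pseudonyms_per_hour(period_s)
        print(f"period={period_s:5.0f}s  P(change in zone)={p:.2f}  "
              f"pseudonyms/h={cost:.0f}")
```

Under these assumptions, once the period drops below the dwell time the probability stays at 1 while the cost proxy keeps growing. Since the dwell time depends on the adversary's observing zone, the designer cannot fix the best period in advance.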

In the second half of Chapter 3, I proposed a simple and effective privacy preserving scheme for VANETs, called SLOW. SLOW requires vehicles to stop sending heartbeat messages below a given threshold speed (hence the name SLOW, which stands for “silence at low speeds”) and to change all their identifiers (pseudonyms) after each such silent period. With SLOW, the vicinities of intersections and traffic lights become dynamically created mix zones, as there are usually many vehicles moving slowly at these places at any given moment. In other words, SLOW implicitly ensures a silent period and pseudonym change that are synchronized for many vehicles both in time and space, and this makes it effective as a location privacy enhancing scheme. Yet SLOW is remarkably simple, and it has further advantages. For instance, it relieves vehicles of the burden of verifying a potentially large number of digital signatures when the vehicle density is high, as this usually happens when the vehicles move slowly in a traffic jam or stop at intersections. Finally, the risk of a fatal accident at low speed is low, and therefore SLOW does not seriously impact safety-of-life.
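The rule described above (silence below a threshold speed, new identifiers after each silent period) can be sketched as a small state machine. This is a minimal illustration, not the implementation evaluated in the thesis: the class name `SlowVehicle`, the threshold value, and the use of a random token in place of a certified pseudonym are assumptions made for the example.

```python
import secrets
from dataclasses import dataclass, field
from typing import Optional


def new_pseudonym() -> str:
    # Stand-in for loading the next certified pseudonym; a random token here.
    return secrets.token_hex(8)


@dataclass
class SlowVehicle:
    threshold_kmh: float = 30.0  # illustrative value, an assumption
    pseudonym: str = field(default_factory=new_pseudonym)
    silent: bool = False

    def step(self, speed_kmh: float) -> Optional[str]:
        """Return the pseudonym used for this heartbeat, or None if silent."""
        if speed_kmh < self.threshold_kmh:
            # Below the threshold speed: send no heartbeat.
            self.silent = True
            return None
        if self.silent:
            # Leaving a silent period: switch identifiers before sending again.
            self.pseudonym = new_pseudonym()
            self.silent = False
        return self.pseudonym


if __name__ == "__main__":
    vehicle = SlowVehicle()
    for speed in (50.0, 45.0, 20.0, 0.0, 10.0, 40.0, 55.0):
        sender = vehicle.step(speed)
        print(f"{speed:5.1f} km/h ->", sender if sender else "(silent)")
```

In this sketch a vehicle keeps its pseudonym while it moves above the threshold and changes it only after it has been silent. This is why many vehicles slowing down at the same intersection change identifiers at roughly the same time and place.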

I evaluated SLOW in a specific attacker model that seems realistic, and it proved effective in this model, reducing the success rate of tracking a target vehicle from its starting point to its destination to the range of 10–30%.

Future work could include a detailed analysis of the effect of SLOW on the safety of vehicles, or an analysis of the exceptional cases where vehicles are forced to send a beacon message below the threshold speed.

In Chapter 4, I proposed two private aggregation algorithms for wireless sensor networks. In wireless sensor networks, in-network data aggregation is often used to ensure scalability and energy efficient operation. However, this also introduces security issues: the designated aggregator nodes that collect and store aggregated sensor readings and communicate with the base station are attractive targets of physical node destruction and jamming attacks. To mitigate this problem, I proposed two private aggregator node election protocols for wireless sensor networks that hide the elected aggregator nodes from the attacker, who therefore cannot locate and disable them. My basic protocol provides fewer guarantees than my advanced protocol, but it may be sufficient in cases where the risk of physically compromising nodes is low. My advanced protocol hides the identity of the elected aggregator nodes even from insider attackers, and thus it handles node compromise attacks too.

I also proposed a private data aggregation protocol and a corresponding private query protocol for the advanced version, which allow the aggregator nodes to collect sensor readings and respond to queries of the operator, respectively, without revealing any useful information about their identity. My aggregation and query protocols are resistant to both external eavesdroppers and compromised nodes participating in the protocol. The communication in the advanced protocol is based on the concept of a connected dominating set, which is well suited to wireless sensor networks.

At the end of Chapter 4, I went beyond the goal of only hiding the identity of the aggregator nodes. I also analyzed what happens if a malicious node wants to exploit the anonymity offered by the system and tries to mislead the operator by injecting false reports. I proposed an algorithm that can detect whether any of the nodes misbehaves in the query phase. The algorithm only detects the fact of misbehavior; identifying the misbehaving node itself is left for future work. A more challenging direction for future work is reducing the message or computational complexity of the election subprotocol.



List of Acronyms

CA Cluster Aggregator

CDS Connected Dominating Set

CH Cluster Head

DSRC Dedicated Short-Range Communications

ID Identifier

IR Infrared

MAC Message Authentication Code

OBU On-Board Unit

RF Radio Frequency

RFID Radio Frequency Identification

RSA Rivest–Shamir–Adleman algorithm

RSU Road Side Unit

SEVECOM Secure Vehicular Communication

SLOW Silence at Low Speeds

SUMO Simulation of Urban Mobility

TTL Time To Live

VANET Vehicular Ad Hoc Network

VIN Vehicle Identification Number

WSAN4CIP Wireless Sensor and Actuator Networks for Critical Infrastructure Protection

WSN Wireless Sensor Network



List of publications

[Avoine et al., 2007] Gildas Avoine, Levente Buttyán, Tamás Holczer, and István Vajda. Group-based private authentication. In Proceedings of the International Workshop on Trust, Security, and Privacy for Ubiquitous Computing (TSPUC 2007). IEEE, 2007.

[Buttyán and Holczer, 2009] Levente Buttyán and Tamás Holczer. Private cluster head election in wireless sensor networks. In Proceedings of the Fifth IEEE International Workshop on Wireless and Sensor Networks Security (WSNS 2009), pages 1048–1053. IEEE, 2009.

[Buttyán and Holczer, 2010] Levente Buttyán and Tamás Holczer. Perfectly anonymous data aggregation in wireless sensor networks. In Proceedings of The 7th IEEE International Conference on Mobile Ad-hoc and Sensor Systems (WSNS 2010), San Francisco, November 2010. IEEE.

[Buttyán et al., 2004] Levente Buttyán, Tamás Holczer, and Péter Schaffer. Incentives for cooperation in multi-hop wireless networks. Híradástechnika, LIX(3):30–34, March 2004. (in Hungarian).

[Buttyán et al., 2005] Levente Buttyán, Tamás Holczer, and Péter Schaffer. Spontaneous cooperation in multi-domain sensor networks. In Proceedings of the 2nd European Workshop on Security and Privacy in Ad-hoc and Sensor Networks (ESAS), Visegrád, Hungary, July 2005. Springer.

[Buttyán et al., 2006a] Levente Buttyán, Tamás Holczer, and István Vajda. Optimal key-trees for tree-based private authentication. In Proceedings of the International Workshop on Privacy Enhancing Technologies (PET), June 2006. Springer.

[Buttyán et al., 2006b] Levente Buttyán, Tamás Holczer, and István Vajda. Providing location privacy in automated fare collection systems. In Proceedings of the 15th IST Mobile and Wireless Communication Summit, Mykonos, Greece, June 2006.

[Buttyán et al., 2007] Levente Buttyán, Tamás Holczer, and István Vajda. On the effectiveness of changing pseudonyms to provide location privacy in VANETs. In Proceedings of the Fourth European Workshop on Security and Privacy in Ad hoc and Sensor Networks (ESAS 2007). Springer, 2007.

[Buttyán et al., 2009] Levente Buttyán, Tamás Holczer, André Weimerskirch, and William Whyte. SLOW: A practical pseudonym changing scheme for location privacy in VANETs. In Proceedings of the IEEE Vehicular Networking Conference, pages 1–8. IEEE, October 2009.

[Dóra and Holczer, 2010] László Dóra and Tamás Holczer. Hide-and-lie: Enhancing application-level privacy in opportunistic networks. In Proceedings of the Second International Workshop on Mobile Opportunistic Networking ACM/SIGMOBILE MobiOpp 2010, Pisa, Italy, February 22–23, 2010.

[Dvir et al., 2011] Amit Dvir, Tamás Holczer, and Levente Buttyán. VeRA - version number and rank authentication in RPL. In Proceedings of the 7th IEEE International Workshop on Wireless and Sensor Networks Security (WSNS 2011). IEEE, 2011.


[Holczer and Buttyán, 2011] Tamás Holczer and Levente Buttyán. Anonymous aggregator election and data aggregation in wireless sensor networks. International Journal of Distributed Sensor Networks, page 18, 2011. Article ID 828414.

[Holczer et al., 2009] Tamás Holczer, Petra Ardelean, Naim Asaj, Stefano Cosenza, Michael Müter, Albert Held, Björn Wiedersheim, Panagiotis Papadimitratos, Frank Kargl, and Danny De Cock. Secure vehicle communication (SEVECOM). Demonstration. Mobisys, June 2009.

[Papadimitratos et al., 2008] Panagiotis Papadimitratos, Antonio Kung, Frank Kargl, Zhendong Ma, Maxim Raya, Julien Freudiger, Elmar Schoch, Tamás Holczer, Levente Buttyán, and Jean-Pierre Hubaux. Secure vehicular communication systems: design and architecture. IEEE Communications Magazine, 46(11):100–109, 2008.

[Schaffer et al., 2012] Péter Schaffer, Károly Farkas, Ádám Horváth, Tamás Holczer, and Levente Buttyán. Secure and reliable clustering in wireless sensor networks: A critical survey. Computer Networks, 2012.



Bibliography<br />

[Abadi and Fournet, 2004] M. Abadi and C. Fournet. Private authentication. Theoretical Computer<br />

Science, 322(3):427–476, 2004.<br />

[Akyildiz et al., 2002] I.F. Akyildiz, W. Su, Y. Sankarasubramaniam, and E. Cayirci. Wireless<br />

sensor networks: a survey. Computer networks, 38(4):393–422, 2002.<br />

[Anderson and Kuhn, 1996] R. Anderson and M. Kuhn. Tamper resistance: a cautionary note. In<br />

Proceedings of the 2nd conference on Proceedings of the Second USENIX Workshop on Electronic<br />

Commerce-Volume 2, page 1. USENIX Association, 1996.<br />

[Aoki and Fujii, 1996] M. Aoki and H. Fujii. Inter-vehicle communication: Technical issues on<br />

vehicle control application. Communications Magazine, IEEE, 34(10):90–93, 1996.<br />

[Armknecht et al., 2007] F. Armknecht, A. Festag, D. Westhoff, and K. Zeng. Cross-layer privacy<br />

enhancement and non-repudiation in vehicular communication. In 4th Workshop on Mobile<br />

Ad-Hoc Networks (WMAN), 2007.<br />

[ASV, ] Advanced safety vehicle program. ”http://www.ahsra.or.jp/demo2000/eng/demo_e/<br />

ahs_e7/iguchi/iguchi.html”.<br />

[Avoine and Oechslin, 2005] G. Avoine and P. Oechslin. A scalable and provably secure hash-based<br />

rfid protocol. In Pervasive Computing and Communications Workshops, 2005. PerCom 2005<br />

Workshops. Third IEEE International Conference on, pages 110–114. IEEE, 2005.<br />

[Avoine et al., 2005] G. Avoine, E. Dysli, and P. Oechslin. Reducing time complexity in rfid systems.<br />

In Proceedings of the 12th Annual Workshop on Selected Areas in Cryptography (SAC’05),<br />

pages 291–306. Springer, 2005.<br />

[Avoine, 2012] Gildas Avoine. Bibliography on security and privacy in rfid systems.<br />

http://www.epfl.ch/*gavoine/rfid/, 2012.<br />

[Baruya, 1998] A. Baruya. Speed-accident relationship on different kinds of european roads. MAS-<br />

TER Deliverable 7, September 1998.<br />

[Beresford and Stajano, 2003] A.R. Beresford and F. Stajano. Location privacy in pervasive computing.<br />

Pervasive Computing, IEEE, 2(1):46–55, 2003.<br />

[Beresford and Stajano, 2004] A.R. Beresford and F. Stajano. Mix zones: User privacy in locationaware<br />

services. In Pervasive Computing and Communications Workshops, 2004. Proceedings of<br />

the Second IEEE Annual Conference on, pages 127–131. IEEE, 2004.<br />

[Berki, 2008] Z. Berki. Development of Traffic Models on the basis of Passanger Demand Surveys<br />

<strong>Thesis</strong> of the PhD dissertation. PhD thesis, Budapest University of Technology and Economics,<br />

2008.<br />

85


BIBLIOGRAPHY<br />

[Beye and Veugen, 2011] M. Beye and T. Veugen. Improved anonymity for key-trees? Technical<br />

report, Cryptology ePrint Archive, Report 2011/395, 2011.<br />

[Beye and Veugen, 2012] M. Beye and T. Veugen. Anonymity for key-trees with adaptive adversaries.<br />

Security and Privacy in Communication Networks, pages 409–425, 2012.<br />

[Black and McGrew, 2008] David L. Black and David A. McGrew. The internet key exchange<br />

(ikev2) protocol. 2008.<br />

[Blum et al., 2004a] Jeremy Blum, Min Ding, Andrew Thaeler, and Xiuzhen Cheng. Connected<br />

dominating set in sensor networksand manets. In D.-Z. Du and P. Pardalos, editors, Handbook<br />

of Combinatorial Optimization, pages 329–369. Kluwer Academic Publishers, 2004.<br />

[Blum et al., 2004b] J.J. Blum, A. Eskandarian, and L.J. Hoffman. Challenges of intervehicle ad<br />

hoc networks. Intelligent Transportation Systems, IEEE Transactions on, 5(4):347–351, 2004.<br />

[Bono et al., 2005] S. Bono, M. Green, A. Stubblefield, A. Juels, A. Rubin, and M. Szydlo. Security<br />

analysis of a cryptographically-enabled rfid device. In 14th USENIX Security Symposium,<br />

volume 1, page 16, 2005.<br />

[Boyd and Mathuria, 2003] C. Boyd and A. Mathuria. Protocols for authentication and key establishment.<br />

Springer Verlag, 2003.<br />

[Brandt, 2006] F. Brandt. Efficient cryptographic protocol design based on distributed El Gamal<br />

encryption. Lecture Notes in Computer Science, 3935:32, 2006.<br />

[Buttyán and Holczer, 2010] Levente Buttyán and Tamas Holczer. Perfectly anonymous data aggregation<br />

in wireless sensor networks. In Proceedings of the Sixth IEEE International Workshop<br />

on Wireless and Sensor Networks Security (WSNS’10). IEEE, IEEE, 2010.<br />

[Buttyán and Hubaux, 2008] Levente Buttyán and Jean Pierre Hubaux. Security and Cooperation<br />

in Wireless Networks. Cambridge University Press, 2008.<br />

[Buttyán and Schaffer, 2010] Levente Buttyán and Peter Schaffer. Panel: Position-based aggregator<br />

node election in wireless sensor networks. International Journal of Distributed Sensor<br />

Networks, 2010.<br />

[Buttyán et al., 2006] Levente Buttyán, Peter Schaffer, and István Vajda. Ranbar: Ransac-based<br />

resilient aggregation in sensor networks. In In Proceedings of the Fourth ACM Workshop on<br />

Security of Ad Hoc and Sensor Networks (SASN), Alexandria, VA, USA, October 2006. ACM<br />

Press.<br />

[Buttyán et al., 2009] Levente Buttyán, Peter Schaffer, and István Vajda. Cora: Correlation-based<br />

resilient aggregation in sensor networks. Elsevier Ad Hoc Networks, 7(6):1035–1050, 2009.<br />

[Calandriello et al., 2007] Giorgio Calandriello, Panos Papadimitratos, Jean-Pierre Hubaux, and<br />

Antonio Lioy. Efficient and robust pseudonymous authentication in vanet. In VANET ’07:<br />

Proceedings of the fourth ACM international workshop on Vehicular ad hoc networks, pages<br />

19–28, New York, NY, USA, 2007. ACM.<br />

[Camenisch and Lysyanskaya, 2001] J. Camenisch and A. Lysyanskaya. An efficient system for<br />

non-transferable anonymous credentials with optional anonymity revocation. Advances in<br />

Cryptology-EUROCRYPT 2001, pages 93–118, 2001.<br />

[Camenisch and Stadler, 1997] Jan Camenisch and Markus Stadler. Proof systems for general<br />

statements about discrete logarithms. Technical report, Department of Computer Science, ETH<br />

Zürich, 1997.<br />

86


Bibliography<br />

[Carbunar et al., 2007] B. Carbunar, Y. Yu, L. Shi, M. Pearce, and V. Vasudevan. Query privacy<br />

in wireless sensor networks. In Sensor, Mesh and Ad Hoc Communications and Networks, 2007.<br />

SECON’07. 4th Annual IEEE Communications Society Conference on, pages 203–212. IEEE,<br />

2007.<br />

[Chan and Perrig, 2003] H. Chan and A. Perrig. Security and privacy in sensor networks. Computer,<br />

36(10):103–105, 2003.<br />

[Chan et al., 2003] H. Chan, A. Perrig, and D. Song. Random key predistribution schemes for<br />

sensor networks. In IEEE Symposium on Security and Privacy, pages 197–215. IEEE Computer<br />

Society, 2003.<br />

[Chang, 2006] E.J.H. Chang. Echo algorithms: Depth parallel operations on general graphs. Software<br />

Engineering, IEEE Transactions on, (4):391–401, 2006.<br />

[Chaum, 1981] D.L. Chaum. Untraceable electronic mail, return addresses, and digital pseudonyms.<br />

Communications of the ACM, 24(2):84–90, 1981.<br />

[Chaum, 1988] D. Chaum. The dining cryptographers problem: Unconditional sender and recipient<br />

untraceability. Journal of Cryptology, 1(1):65–75, 1988.<br />

[Chisalita and Shahmehri, 2002] L. Chisalita and N. Shahmehri. A peer-to-peer approach to vehicular<br />

communication for the support of traffic safety applications. In Intelligent Transportation<br />

Systems, 2002. Proceedings. The IEEE 5th International Conference on, pages 336–341. IEEE,<br />

2002.<br />

[Choi et al., 2005] J.Y. Choi, M. Jakobsson, and S. Wetzel. Balancing auditability and privacy in<br />

vehicular networks. In Proceedings of the 1st ACM international workshop on Quality of service<br />

& security in wireless and mobile networks, pages 79–87. ACM, 2005.<br />

[Choi et al., 2007] H. Choi, P. McDaniel, and TF La Porta. Privacy Preserving Communication<br />

in MANETs. In 4th Annual IEEE Communications Society Conference on Sensor, Mesh and<br />

Ad Hoc Communications and Networks, pages 233–242, 2007.<br />

[COM, ] Communications for esafety. ”http://www.comesafety.org/”.<br />

[Consortium, 2012] Car 2 Car Communication Consortium. ”http://www.car-to-car.org”,<br />

2012.<br />

[Deng et al., 2005] J. Deng, R. Han, and S. Mishra. Countermeasures against traffic analysis<br />

attacks in wireless sensor networks. In Security and Privacy for Emerging Areas in Communications<br />

Networks, 2005. SecureComm 2005. First International Conference on, pages 113–126.<br />

IEEE, 2005.<br />

[Deng et al., 2006a] J. Deng, R. Han, and S. Mishra. Decorrelating wireless sensor network traffic<br />

to inhibit traffic analysis attacks. Pervasive and Mobile Computing, 2(2):159–186, 2006.<br />

[Deng et al., 2006b] J. Deng, R. Han, and S. Mishra. Decorrelating wireless sensor network traffic<br />

to inhibit traffic analysis attacks. Pervasive and Mobile Computing, 2(2):159–186, 2006.<br />

[Diaz et al., 2002] C. Diaz, S. Seys, J. Claessens, and B. Preneel. Towards measuring anonymity.<br />

In Proceedings of the 2nd international conference on Privacy enhancing technologies, pages<br />

54–68. Springer-Verlag, 2002.<br />

[Dingledine et al., 2004] R. Dingledine, N. Mathewson, and P. Syverson. Tor: The secondgeneration<br />

onion router. Technical report, DTIC Document, 2004.<br />

[Dötzer, 2006] F. Dötzer. Privacy issues in vehicular ad hoc networks. In Privacy Enhancing<br />

Technologies, pages 197–209. Springer, 2006.<br />

87


BIBLIOGRAPHY<br />

[El Zarki et al., 2002] M. El Zarki, S. Mehrotra, G. Tsudik, and N. Venkatasubramanian. Security<br />

issues in a future vehicular network. In European Wireless, volume 2, 2002.<br />

[Faizulkhakov, 2007] Ya. R. Faizulkhakov. Time synchronization methods for wireless sensor networks:<br />

A survey. Programming and Computing Software, 33(4):214–226, 2007.<br />

[Fisher, 2006] J.A. Fisher. Indoor positioning and digital management. Surveillance and security:<br />

Technological politics and power in everyday life, page 77, 2006.<br />

[Fishkin et al., 2005] K. Fishkin, S. Roy, and B. Jiang. Some methods for privacy in rfid communication.<br />

Security in ad-hoc and sensor networks, pages 42–53, 2005.<br />

[Floerkemeier et al., 2005] C. Floerkemeier, R. Schneider, and M. Langheinrich. Scanning with<br />

a purpose–supporting the fair information principles in rfid protocols. Ubiquitous Computing<br />

Systems, pages 214–231, 2005.<br />

[Francillon and Castelluccia, 2007] Aurélien Francillon and Claude Castelluccia. TinyRNG: A<br />

cryptographic random number generator for wireless sensors network nodes. In Modeling and<br />

Optimization in Mobile, Ad Hoc and Wireless Networks and Workshops, 2007. WiOpt 2007. 5th<br />

International Symposium on, pages 1–7, April 2007.<br />

[Freudiger et al., 2007] J. Freudiger, M. Raya, M. Felegyházi, P. Papadimitratos, and J.-P.<br />

Hubaux. Mix-zones for location privacy in vehicular networks. In Proceedings of the 1st International<br />

Workshop on Wireless Networking for Intelligent Transportation Systems (WiN-ITS<br />

07), 2007.<br />

[Ganesan et al., 2003] P. Ganesan, R. Venugopalan, P. Peddabachagari, A. Dean, F. Mueller, and<br />

M. Sichitiu. Analyzing and modeling encryption overhead for sensor network nodes. In Proceedings<br />

of the 2nd ACM international conference on Wireless sensor networks and applications,<br />

Sep. 2003.<br />

[Gerlach, 2006] M. Gerlach. Assessing and improving privacy in vanets. ESCAR Embedded Security<br />

in Cars, 2006.<br />

[Gicheol, 2010] W. Gicheol. Secure cluster head election using mark based exclusion in wireless<br />

sensor networks. IEICE transactions on communications, 93(11):2925–2935, 2010.<br />

[Goldsmith, 2005] Andrea Goldsmith. Wireless Communications. Cambridge University Press,<br />

New York, NY, USA, 2005.<br />

[Gruteser and Hoh, 2005] M. Gruteser and B. Hoh. On the anonymity of periodic location samples.<br />

In Proceedings of the Second International Conference on Security in Pervasive Computing, pages<br />

179–192. Springer, 2005.<br />

[Gulcu and Tsudik, 1996] C. Gulcu and G. Tsudik. Mixing e-mail with babel. In Network and<br />

Distributed System Security, 1996., Proceedings of the Symposium on, pages 2–16. IEEE, 1996.<br />

[Hancke, 2005] G.P. Hancke. A practical relay attack on iso 14443 proximity cards. Technical<br />

report, University of Cambridge Computer Laboratory, 2005.<br />

[Hao and Zielinski, 2006] F. Hao and P. Zielinski. A 2-round anonymous veto protocol. In Proceedings<br />

of the 14th International Workshop on Security Protocols, Cambridge, UK, 2006.<br />

[Harkins and Carrel, 1998] D. Harkins and D. Carrel. The internet key exchange (ike)protocol.<br />

1998.<br />

[Hartenstein and Laberteaux, 2008] H. Hartenstein and K.P. Laberteaux. A tutorial survey on<br />

vehicular ad hoc networks. Communications Magazine, IEEE, 46(6):164 –171, June 2008.<br />

88


Bibliography<br />

[He et al., 2007] W. He, X. Liu, H. Nguyen, K. Nahrstedt, and T. Abdelzaher. Pda: Privacypreserving<br />

data aggregation in wireless sensor networks. In Proceedings of Infocom, pages 2045–<br />

2053. IEEE, 2007.<br />

[Heinzelman et al., 2000] WR Heinzelman, A. Chandrakasan, and H. Balakrishnan. Energyefficient<br />

communication protocol for wireless microsensor networks. In Proceedings of the 33rd<br />

Annual Hawaii International Conference onSystem Sciences., page 10, 2000.<br />

[Hu and Wang, 2005] Y.C. Hu and H.J. Wang. A framework for location privacy in wireless networks.<br />

In ACM SIGCOMM Asia Workshop. Citeseer, 2005.<br />

[Hu et al., 2005] Y.C. Hu, A. Perrig, and D.B. Johnson. Ariadne: A secure on-demand routing<br />

protocol for ad hoc networks. Wireless Networks, 11(1-2):21–38, 2005.<br />

[Hu et al., 2006] Y.C. Hu, A. Perrig, and D.B. Johnson. Wormhole attacks in wireless networks.<br />

Selected Areas in Communications, IEEE Journal on, 24(2):370–380, 2006.<br />

[Huang et al., 2005] L. Huang, K. Matsuura, H. Yamane, and K. Sezaki. Enhancing wireless<br />

location privacy using silent period. In Wireless Communications and Networking Conference,<br />

2005 IEEE, volume 2, pages 1187–1192. IEEE, 2005.<br />

[Huang et al., 2009] Y. Huang, W. He, and K. Nahrstedt. ChainFarm: A Novel Authentication<br />

Protocol for High-rate Any Source Probabilistic Broadcast. In Proc. of The 6th IEEE International<br />

Conference on Mobile Ad-hoc and Sensor Systems (IEEE MASS), 2009.<br />

[Hubaux et al., 2004] J.P. Hubaux, S. Capkun, and J. Luo. The security and privacy of smart<br />

vehicles. Security & Privacy, IEEE, 2(3):49–55, 2004.<br />

[Instruments, 2005] Texas Instruments. Securing the pharmaceutical supply chain with rfid and<br />

public-key infrastructure (pki) technologies. texas instruments white paper, june 2005, 2005.<br />

[Iqbal and Khayam, 2009] Adnan Iqbal and Syed Ali Khayam. An energy-efficient link layer protocol<br />

for reliable transmission over wireless networks. EURASIP J. Wirel. Commun. Netw.,<br />

2009:28:1–28:10, January 2009.<br />

[ISO, 2008] Iso 9798-2. mechanisms using symmetric encipherment algorithms. 2008.<br />

[ITLaw, ] ITLaw. Right of privacy. ”http://itlaw.wikia.com/wiki/Right_of_privacy”.<br />

[Jacquet, 2004] Philippe Jacquet. Performance of connected dominating set in olsr protocol. Technical<br />

Report RR-5098, INRIA, 2004.<br />

[Jian et al., 2007] Y. Jian, S. Chen, Z. Zhang, and L. Zhang. Protecting receiver-location privacy in<br />

wireless sensor networks. In INFOCOM 2007. 26th IEEE International Conference on Computer<br />

Communications. IEEE, pages 1955–1963. Ieee, 2007.<br />

[Juels and Brainard, 2004] A. Juels and J. Brainard. Soft blocking: Flexible blocker tags on the<br />

cheap. In Proceedings of the 2004 ACM workshop on Privacy in the electronic society, pages<br />

1–7. ACM, 2004.<br />

[Juels et al., 2003] A. Juels, R.L. Rivest, and M. Szydlo. The blocker tag: Selective blocking of<br />

rfid tags for consumer privacy. In Proceedings of the 10th ACM conference on Computer and<br />

communications security, pages 103–111. ACM, 2003.<br />

[Juels et al., 2006] A. Juels, P. Syverson, and D. Bailey. High-power proxies for enhancing rfid<br />

privacy and utility. In Privacy Enhancing Technologies, pages 210–226. Springer, 2006.<br />

[Juels, 2005a] A. Juels. Minimalist cryptography for low-cost rfid tags. Security in Communication<br />

Networks, pages 149–164, 2005.<br />

89


BIBLIOGRAPHY<br />

[Juels, 2005b] A. Juels. Strengthening epc tags against cloning. In Proceedings of the 4th ACM<br />

workshop on Wireless security, pages 67–76. ACM, 2005.<br />

[Juels, 2006] A. Juels. Rfid security and privacy: A research survey. Selected Areas in Communications,<br />

IEEE Journal on, 24(2):381–394, 2006.<br />

[Kamat et al., 2005] P. Kamat, Y. Zhang, W. Trappe, and C. Ozturk. Enhancing source-location<br />

privacy in sensor network routing. In Distributed Computing Systems, 2005. ICDCS 2005.<br />

Proceedings. 25th IEEE International Conference on, pages 599–608. IEEE, 2005.<br />

[Kamat et al., 2007] P. Kamat, W. Xu, W. Trappe, and Y. Zhang. Temporal privacy in wireless<br />

sensor networks. In Distributed Computing Systems, 2007. ICDCS’07. 27th International<br />

Conference on, pages 23–23. IEEE, 2007.<br />

[Kargl et al., 2008] Frank Kargl, Antonio Kung, Albert Held, Giorgo Calandriello, Ta Vinh Thong,<br />

Björn Wiedersheim, Elmar Schoch, Michael Müter, Levente Buttyán, Panagiotis Papadimitratos,<br />

and Jean-Pierre Hubaux. Secure vehicular communication systems: implementation,<br />

performance, and research challenges. IEEE Communications Magazine, 46(11):110–118, 2008.<br />

[Karnadi et al., 2005] F.K. Karnadi, Z.H. Mo, and K. Lan. Rapid generation of realistic mobility<br />

models for vanet. In Wireless Communications and Networking Conference, 2007. WCNC 2007.<br />

IEEE, pages 2506–2511. IEEE, 2005.<br />

[Kelly and Erickson, 2005] E.P. Kelly and G.S. Erickson. Rfid tags: commercial applications v.<br />

privacy rights. Industrial Management & Data Systems, 105(6):703–713, 2005.<br />

[Kesdogan et al., 1998] D. Kesdogan, J. Egner, and R. Büschkes. Stop-and-go-mixes providing<br />

probabilistic anonymity in an open system. In Information Hiding, pages 83–98. Springer, 1998.<br />

[Kfir and Wool, 2005] Z. Kfir and A. Wool. Picking virtual pockets using relay attacks on contactless<br />

smartcard. In Security and Privacy for Emerging Areas in Communications Networks,<br />

2005. SecureComm 2005. First International Conference on, pages 47–58. IEEE, 2005.<br />

[Kloeden et al., 1997] C.N. Kloeden, A.J. McLean, V.M. Moore, and G. Ponte. Travelling speed<br />

and the risk of crash involvement. NHMRC Road Accident Research Unit, The University of<br />

Adelaide, 1997.<br />

[Kohl and Neuman, 1993] J. Kohl and C. Neuman. Rfc 1510: The kerberos network authentication<br />

service (v5). Published Sep, 1993.<br />

[Krajzewicz et al., 2002] Daniel Krajzewicz, Georg Hertkorn, Christian Rössel, and Peter Wagner.<br />

Sumo (simulation of urban mobility); an open-source traffic simulation. In A Al-Akaidi, editor,<br />

Proceedings of the 4th Middle East Symposium on Simulation and Modelling (MESM2002), pages<br />

183–187, Sharjah, United Arab Emirates, September 2002. SCS European Publishing House.<br />

[Kroh et al., 2006] Rainer Kroh, Antonio Kung, and Frank Kargl. Vanets security requirements<br />

final v ersion. Sevecom D1.1, 2006.<br />

[Kruskal, 1956] Jr. Kruskal, Joseph B. On the shortest spanning subtree of a graph and the<br />

traveling salesman problem. Proceedings of the American Mathematical Society, 7(1):pp. 48–50,<br />

1956.<br />

[Kuhn et al., 2006] F. Kuhn, T. Moscibroda, and R. Wattenhofer. Fault-tolerant clustering in ad<br />

hoc and sensor networks. In Distributed Computing Systems, 2006. ICDCS 2006. 26th IEEE<br />

International Conference on, pages 68–68. IEEE, 2006.<br />

[Langheinrich, 2009] M. Langheinrich. A survey of rfid privacy approaches. Personal and Ubiquitous<br />

Computing, 13(6):413–421, 2009.<br />

90


Bibliography<br />

[Leaf and Preusser, 1999] W.A. Leaf and D.F. Preusser. Literature review on vehicle<br />

travel speeds and pedestrian injuries. National Highway Traffic Safety Administration,<br />

http://www.nhtsa.dot.gov/people/injury/research/ pub/HS809012.html, October 1999.<br />

[Li et al., 2009] N. Li, N. Zhang, S.K. Das, and B. Thuraisingham. Privacy preservation in wireless<br />

sensor networks: A state-of-the-art survey. Ad Hoc Networks, 2009.<br />

[Lin and Lu, 2012] Xiaodong Lin and Rongxing Lu. Bibliography on secure vehicular communications.<br />

http://bbcr.uwaterloo.ca/ rxlu/sevecombib.htm, 2012.<br />

[Lin et al., 2008] X. Lin, R. Lu, C. Zhang, H. Zhu, P.H. Ho, and X. Shen. Security in vehicular<br />

ad hoc networks. Communications Magazine, IEEE, 46(4):88–95, 2008.<br />

[Liu and Ning, 2008] An Liu and Peng Ning. Tinyecc: A configurable library for elliptic curve<br />

cryptography in wireless sensor networks. In Proceedings of the 7th International Conference on<br />

Information Processing in Sensor Networks (IPSN 2008), pages 245–256, April 2008.<br />

[Liu et al., 2005] D. Liu, P. Ning, S. Zhu, and S. Jajodia. Practical broadcast authentication in sensor<br />

networks. In Mobile and Ubiquitous Systems: Networking and Services, 2005. MobiQuitous<br />

2005. The Second Annual International Conference on, pages 118–129, 2005.<br />

[Lopez and Zhou, 2008] J. Lopez and J. Zhou. Wireless Sensor Network Security. Cryptology and<br />

Information Security Series, IOS Press, 2008.<br />

[Lopez, 2008] J. Lopez. Wireless sensor network security, volume 1. Ios Pr Inc, 2008.<br />

[Lu et al., 2012] R. Lu, X. Li, T.H. Luan, X. Liang, and X. Shen. Pseudonym changing at social<br />

spots: An effective strategy for location privacy in vanets. Vehicular Technology, IEEE<br />

Transactions on, 61(1):86–96, 2012.<br />

[Luo and Hubaux, 2004] J. Luo and J.P. Hubaux. A survey of inter-vehicle communication. Lausanne,<br />

Switzerland, Tech. Rep, IC/2004/24, 2004.<br />

[Ma et al., 2010] Z. Ma, F. Kargl, and M. Weber. Measuring long-term location privacy in vehicular<br />

communication systems. Computer Communications, 33(12):1414–1427, 2010.<br />

[McMillin et al., 1998] B. McMillin, J. Sirois, R. Mahoney, and F. Budd. Fault-tolerant and secure<br />

intelligent vehicle highway system software: a safety prototype. In IEEE International Conference<br />

on Intelligent Vehicles. IEEE, 1998.<br />

[Mehta et al., 2007] K. Mehta, D. Liu, and M. Wright. Location privacy in sensor networks against<br />

a global eavesdropper. In Network Protocols, 2007. ICNP 2007. IEEE International Conference<br />

on, pages 314–323. IEEE, 2007.<br />

[mir, ] http://www.shamus.ie/.<br />

[Molnar and Wagner, 2004] D. Molnar and D. Wagner. Privacy and security in library RFID: Issues,<br />

practices, and architectures. In Proceedings of the 11th ACM conference on Computer and<br />

communications security, pages 210–219. ACM, 2004.<br />

[Nohara et al., 2005] Y. Nohara, S. Inoue, K. Baba, and H. Yasuura. Quantitative evaluation of<br />

unlinkable id matching schemes. In Proceedings of the 2005 ACM workshop on Privacy in the<br />

electronic society, pages 55–60. ACM, 2005.<br />

[Ohkubo et al., 2004] M. Ohkubo, K. Suzuki, and S. Kinoshita. Efficient hash-chain based RFID<br />

privacy protection scheme. In International Conference on Ubiquitous Computing–Ubicomp,<br />

Workshop Privacy: Current Status and Future Directions, 2004.<br />


[Oliveira et al., 2008] Leonardo B. Oliveira, Michael Scott, Julio López, and Ricardo Dahab.<br />

TinyPBC: Pairings for Authenticated Identity-Based Non-Interactive Key Distribution in Sensor<br />

Networks. In Proceedings of the 5th International Conference on Networked Sensing Systems<br />

(INSS’08), pages 173–179, Kanazawa/Japan, June 2008. IEEE.<br />

[Peris-Lopez et al., 2006] P. Peris-Lopez, J. Hernandez-Castro, J. Estevez-Tapiador, and A. Ribagorda.<br />

RFID systems: A survey on security threats and proposed solutions. In Personal Wireless<br />

Communications, pages 159–170. Springer, 2006.<br />

[Perrig et al., 2002] Adrian Perrig, Ran Canetti, J. D. Tygar, and Dawn Song. The TESLA Broadcast<br />

Authentication Protocol. RSA CryptoBytes, 5(Summer), 2002.<br />

[Perrig et al., 2004] A. Perrig, J. Stankovic, and D. Wagner. Security in wireless sensor networks.<br />

Communications of the ACM, 47(6):53–57, 2004.<br />

[Pfitzmann and Köhntopp, 2001] A. Pfitzmann and M. Köhntopp. Anonymity, unobservability,<br />

and pseudonymity–a proposal for terminology. In Designing privacy enhancing technologies,<br />

pages 1–9. Springer, 2001.<br />

[Piotrowski et al., 2006] K. Piotrowski, P. Langendoerfer, and S. Peter. How public key cryptography<br />

influences wireless sensor node lifetime. In Proceedings of the fourth ACM workshop on<br />

Security of ad hoc and sensor networks, pages 169–176, Nov. 2006.<br />

[Preneel and Oorschot, 1999] B. Preneel and P.C. van Oorschot. On the security of iterated message<br />

authentication codes. IEEE Transactions on Information theory, 45(1):188–199, 1999.<br />

[Prim, 1957] R.C. Prim. Shortest connection networks and some generalizations. Bell system<br />

technical journal, 36(6):1389–1401, 1957.<br />

[Rajendran and Sreenaath, 2008] T. Rajendran and K. V. Sreenaath. Secure anonymous routing<br />

in ad hoc networks. In Proceedings of the 1st Bangalore Annual Computer Conference. ACM<br />

New York, 2008.<br />

[Rappaport, 2001] Theodore Rappaport. Wireless Communications: Principles and Practice.<br />

Prentice Hall PTR, Upper Saddle River, NJ, USA, 2nd edition, 2001.<br />

[Raya and Hubaux, 2005] M. Raya and J. P. Hubaux. The security of vehicular ad hoc networks.<br />

In Proc. of Third ACM Workshop on Security of Ad Hoc and Sensor Networks (SASN 2005).<br />

ACM, 2005.<br />

[Raya and Hubaux, 2007] M. Raya and J.P. Hubaux. Securing vehicular ad hoc networks. Journal<br />

of Computer Security, 15(1):39–68, 2007.<br />

[Reiter and Rubin, 1998] M.K. Reiter and A.D. Rubin. Crowds: Anonymity for web transactions.<br />

ACM Transactions on Information and System Security (TISSEC), 1(1):66–92, 1998.<br />

[rel, ] http://code.google.com/p/relic-toolkit/.<br />

[Ren et al., 2011] D. Ren, S. Du, and H. Zhu. A novel attack tree based risk assessment approach<br />

for location privacy preservation in the VANETs. In Communications (ICC), 2011 IEEE<br />

International Conference on, pages 1–5. IEEE, 2011.<br />

[RFID, 2012] Wikipedia RFID. Radio-frequency identification.<br />

http://en.wikipedia.org/wiki/Radio-frequency_identification, 2012.<br />

[Rieback et al., 2005] M. Rieback, B. Crispo, and A. Tanenbaum. RFID guardian: A battery-powered<br />

mobile device for RFID privacy management. In Information Security and Privacy, pages<br />

259–273. Springer, 2005.<br />


[Sampigethaya et al., 2005] K. Sampigethaya, L. Huang, M. Li, R. Poovendran, K. Matsuura, and<br />

K. Sezaki. CARAVAN: Providing location privacy for VANET. In Embedded Security in Cars<br />

(ESCAR), 2005.<br />

[Sampigethaya et al., 2007] K. Sampigethaya, M. Li, L. Huang, and R. Poovendran. AMOEBA:<br />

Robust location privacy scheme for VANET. IEEE Journal on Selected Areas in Communications,<br />

25(8):1569–1589, 2007.<br />

[Schnorr, 1991] C. P. Schnorr. Efficient signature generation by smart cards. Journal of Cryptology,<br />

4(3):161–174, 1991.<br />

[Schoch et al., 2006] E. Schoch, F. Kargl, T. Leinmüller, S. Schlott, and P. Papadimitratos. Impact<br />

of pseudonym changes on geographic routing in VANETs. Security and Privacy in Ad-Hoc and<br />

Sensor Networks, pages 43–57, 2006.<br />

[Serjantov and Danezis, 2003] A. Serjantov and G. Danezis. Towards an information theoretic<br />

metric for anonymity. In Privacy Enhancing Technologies, pages 259–263. Springer, 2003.<br />

[Seys and Preneel, 2006] S. Seys and B. Preneel. ARM: Anonymous routing protocol for mobile<br />

ad hoc networks. In 20th International Conference on Advanced Information Networking and<br />

Applications, AINA, pages 133–137. IEEE, 2006.<br />

[Sharma et al., 2012] S. Sharma, A. Sahu, A. Verma, and N. Shukla. Wireless sensor network<br />

security. Advances in Computer Science and Information Technology. Computer Science and<br />

Information Technology, pages 317–326, 2012.<br />

[Sheng and Li, 2008] B. Sheng and Q. Li. Verifiable privacy-preserving range query in two-tiered<br />

sensor networks. In Proceedings of Infocom, pages 46–50. IEEE, 2008.<br />

[Sirivianos et al., 2007] M. Sirivianos, D. Westhoff, F. Armknecht, and J. Girao. Non-manipulable<br />

aggregator node election protocols for wireless sensor networks. In Modeling and Optimization<br />

in Mobile, Ad Hoc and Wireless Networks and Workshops, 2007. WiOpt 2007. 5th International<br />

Symposium on, pages 1–10. IEEE, 2007.<br />

[Studer et al., 2008] A. Studer, E. Shi, F. Bai, and A. Perrig. TACKing Together Efficient Authentication,<br />

Revocation, and Privacy in VANETs. Technical report, Carnegie Mellon CyLab,<br />

2008.<br />

[Syamsuddin et al., 2008] Irfan Syamsuddin, Tharam Dillon, Elizabeth Chang, and Song Han. A<br />

survey of RFID authentication protocols based on hash-chain method. In Convergence and<br />

Hybrid Information Technology – ICCIT’08, volume 2, pages 559–564. IEEE, 2008.<br />

[Szczechowiak et al., 2008] Piotr Szczechowiak, Leonardo B. Oliveira, Michael Scott, Martin Collier,<br />

and Ricardo Dahab. NanoECC: Testing the limits of elliptic curve cryptography in sensor<br />

networks. In Proceedings of the European conference on Wireless Sensor Networks (EWSN’08),<br />

2008.<br />

[Tel, 2000] Gerard Tel. Introduction to Distributed Algorithms (2nd ed.). Cambridge University<br />

Press, 2000.<br />

[VSC, ] Vehicle safety communications project.<br />

http://www-nrd.nhtsa.dot.gov/pdf/nrd-12/CAMP3/pages/VSCC.htm/.<br />

[Wagner, 2004] David Wagner. Resilient aggregation in sensor networks. In Proceedings of the 2nd<br />

ACM workshop on Security of ad hoc and sensor networks, SASN ’04, pages 78–87, New York,<br />

NY, USA, 2004. ACM.<br />

[Wan et al., 2002] C.Y. Wan, A.T. Campbell, and L. Krishnamurthy. PSFQ: a reliable transport<br />

protocol for wireless sensor networks. In Proceedings of the 1st ACM international workshop on<br />

Wireless sensor networks and applications, pages 1–11. ACM, 2002.<br />


[Wiedersheim et al., 2010] B. Wiedersheim, Z. Ma, F. Kargl, and P. Papadimitratos. Privacy in<br />

inter-vehicular networks: Why simple pseudonym change is not enough. In Wireless On-demand<br />

Network Systems and Services (WONS), 2010 Seventh International Conference on, pages 176–<br />

183. IEEE, 2010.<br />

[Willke et al., 2009] T.L. Willke, P. Tientrakool, and N.F. Maxemchuk. A survey of inter-vehicle<br />

communication protocols and their applications. Communications Surveys & Tutorials, IEEE,<br />

11(2):3–20, 2009.<br />

[Wu et al., 2009] D.L. Wu, W.W.Y. Ng, D.S. Yeung, and H.L. Ding. A brief survey on current RFID<br />

applications. In Machine Learning and Cybernetics, 2009 International Conference on, volume 4,<br />

pages 2330–2335. IEEE, 2009.<br />

[Xi et al., 2006] Y. Xi, L. Schwiebert, and W. Shi. Preserving source location privacy in<br />

monitoring-based wireless sensor networks. In Parallel and Distributed Processing Symposium,<br />

2006. IPDPS 2006. 20th International, 8 pp. IEEE, 2006.<br />

[Xiong et al., 2010] Xiaokang Xiong, Duncan S. Wong, and Xiaotie Deng. TinyPairing: A Fast and<br />

Lightweight Pairing-based Cryptographic Library for Wireless Sensor Networks. In Proceedings<br />

of the IEEE Wireless Communications & Networking Conference. IEEE, 2010.<br />

[Yick et al., 2008] J. Yick, B. Mukherjee, and D. Ghosal. Wireless sensor network survey. Computer<br />

networks, 52(12):2292–2330, 2008.<br />

[Zhang et al., 2006] Y. Zhang, W. Liu, W. Lou, and Y. Fang. MASK: Anonymous on-demand<br />

routing in mobile ad hoc networks. IEEE Transactions on Wireless Communications, 5(9):2376–<br />

2385, 2006.<br />

[Zhang et al., 2008] W. Zhang, C. Wang, and T. Feng. GP^2S: Generic privacy-preservation<br />

solutions for approximate aggregation of sensor data (concise contribution). In Pervasive Computing<br />

and Communications, 2008. PerCom 2008. Sixth Annual IEEE International Conference<br />

on, pages 179–184. IEEE, 2008.<br />

[Zhu et al., 2003] Sencun Zhu, Sanjeev Setia, and Sushil Jajodia. LEAP: Efficient security mechanisms<br />

for large-scale distributed sensor networks. In Proceedings of the 10th ACM conference<br />

on Computer and communications security, pages 62–72. ACM Press, 2003.<br />
