15.08.2013 Views

Privacy-Preserving Multi-Dimensional Range Search over ...

Privacy-Preserving Multi-Dimensional Range Search over ...

Privacy-Preserving Multi-Dimensional Range Search over ...

SHOW MORE
SHOW LESS

You also want an ePaper? Increase the reach of your titles

YUMPU automatically turns print PDFs into web optimized ePapers that Google loves.

<strong>Privacy</strong>-<strong>Preserving</strong> <strong>Multi</strong>-<strong>Dimensional</strong> <strong>Range</strong> <strong>Search</strong><br />

<strong>over</strong> Encrypted Cloud Data within Sublinear Time<br />

Boyang Wang †,†† , Yantian Hou † , Ming Li † , Haitao Wang † and Hui Li ††<br />

† Department of Computer Science, Utah State University, Logan, UT, USA<br />

†† State Key Laboratory of Integrated Service Networks, Xidian University, Xi’an, Shaanxi, China<br />

Abstract—Cloud computing promises users massive scale outsourced<br />

data storage services with much lower costs than traditional<br />

methods. However, privacy concerns compel sensitive data<br />

to be stored in the cloud under an encrypted form. This posts<br />

a great challenge for effectively utilizing the outsourced data,<br />

such as executing common SQL queries. A variety of searchable<br />

encryption techniques have been proposed to solve this issue;<br />

yet efficiency and scalability are still the two main obstacles<br />

for their adoptions in real-world datasets, which are multidimensional<br />

in general. In this paper, we propose the first sublinear<br />

time and privacy-preserving searchable encryption scheme<br />

for multi-dimensional range queries <strong>over</strong> encrypted databases.<br />

By designing and leveraging two novel predicate encryption<br />

primitives, our solution efficiently indexes and searches <strong>over</strong> the<br />

encrypted database using multi-dimensional data structures (e.g.,<br />

kd-trees), thus achieving sublinear time complexity. In addition,<br />

our methodology is also general as different data structures<br />

(e.g., range trees) can be exploited to yield different tradeoffs<br />

among privacy, storage <strong>over</strong>head and search time (as low as<br />

polylogarithmic). Our scheme is provably secure, and through<br />

extensive experimental evaluation on a large-scale real-world<br />

dataset, we show the efficiency and scalability of our scheme.<br />

I. INTRODUCTION<br />

With the adoption of cloud computing, data owners can reap<br />

huge economic benefits by outsourcing their data to the cloud<br />

instead of storing their data locally. Due to the serious privacy<br />

concerns in the cloud, sensitive data, such as financial or<br />

medical records, should be encrypted before being outsourced<br />

to the cloud server [1]. As a result, the cloud server or any<br />

other unauthorized party is not allowed to reveal the content<br />

of outsourced data unless it possesses proper decryption keys.<br />

Unfortunately, the encryption on outsourced data inevitably<br />

introduces new challenges to data utilization. Especially, how<br />

to enable data users to effectively search <strong>over</strong> encrypted data as<br />

they do in the plaintext domain, is one of the most significant<br />

issues needed to be solved in the cloud.<br />

To enable different search functions <strong>over</strong> encrypted data,<br />

many searchable encryption (SE) schemes have been proposed<br />

[2]–[7]. Besides protecting the data content as in traditional<br />

encryption schemes, SE schemes should not only hide the<br />

attributes (e.g. keywords) of encrypted data records, but also<br />

protect the content of search queries. However, most of the existing<br />

SE schemes are only able to handle very simple queries,<br />

such as single keyword search [3] or single-dimensional<br />

range queries [6], which are not able to well support multidimensional<br />

range queries <strong>over</strong> encrypted data. Considering<br />

real-world datasets are often multi-dimensional while common<br />

SQL queries (such as equality and comparison) can all be<br />

flexibly expressed as range queries [8], we believe designing<br />

an efficient and privacy-preserving <strong>Multi</strong>-<strong>Dimensional</strong> <strong>Range</strong><br />

<strong>Search</strong> <strong>over</strong> Encrypted data (MDRSE) will be an important<br />

step to make searchable encryption practical on large datasets<br />

in a real-world cloud setting.<br />

Shi et al. [8] first designed an SE scheme to handle multidimensional<br />

range queries <strong>over</strong> encrypted data, which can<br />

be used in many applications, such as network audit logs<br />

and financial databases [8]. Unfortunately, the search time<br />

of its straightforward application to large datasets is linearly<br />

increasing with regard to the total number of data records.<br />

Since the size of a real-world dataset is generally very large<br />

(e.g. in the order of millions or larger), this scheme is clearly<br />

not practical especially under the cloud setting.<br />

More recently, Lu [9] designed a symmetric key based<br />

single dimensional range search scheme (named LSED) on<br />

encrypted data within logarithmic search time, and mentioned<br />

a direct extension of it (denoted as LSED + ) into the multidimension<br />

by achieving sublinear search time. However, this<br />

direct extension, which decomposes multi-dimensional search<br />

and decryption abilities (search tokens and decryption keys)<br />

into the ones in each single dimension, will result in significant<br />

privacy leaks [9].<br />

Salary<br />

2,000<br />

1,000<br />

G<br />

20 30<br />

Age<br />

G The Granted <strong>Search</strong> and Decryption Ability<br />

<strong>Privacy</strong> Leak on One <strong>Range</strong> Query<br />

(a) <strong>Privacy</strong> leak on one query.<br />

Salary<br />

3,000<br />

2,500<br />

2,000<br />

1,000<br />

G<br />

G<br />

20 30 45 50<br />

Age<br />

G The Granted <strong>Search</strong> and Decryption Abilities<br />

<strong>Privacy</strong> Leak Caused by Collusion<br />

(b) <strong>Privacy</strong> leak caused by collusion.<br />

Fig. 1. Examples of privacy leakage in LSED + [9].<br />

Specifically, given the search and decryption ability of<br />

a two-dimensional range query Q = (age ∈ [20,30]) ∧<br />

(salary ∈ [1,000,2,000]) in LSED + , a data user is able to<br />

not only search but also decrypt on a single dimensional range<br />

query Qx = (age ∈ [20,30]) on X-dimension or Qy =<br />

(salary ∈ [1,000,2,000]) on Y-dimension independently,<br />

which clearly reveals more private information to this data user<br />

than he is allowed to (as described in Fig. 1(a)). The leakage of<br />

these privacy, referred to as single-dimensional query privacy<br />

(search) and single-dimensional confidentiality (decryption)<br />

respectively, will be dramatically aggravated when the number<br />

of dimensions of a range query increases. More<strong>over</strong>, failing to<br />

preserve query privacy or confidentiality in single dimension


TABLE I<br />

COMPARISON AMONG DIFFERENT SOLUTIONS.<br />

Solution Storage Cost <strong>Search</strong> Time Single-Dim. Confidentiality/Query <strong>Privacy</strong> Public Key Based<br />

Straightforward [8] O(n·DlogT) O(n·DlogT) Yes / Yes Yes<br />

LSED + [9] O(n·DlogT) O(n 1−1/D ·Dlog 2 T) No / No No<br />

Ours (KD-Trees) O(n·DlogT) O(n 1−1/D ·DlogT) Yes / Yes Yes<br />

Ours (<strong>Range</strong> Trees) O(nlog D−1 n·DlogT) O(log D−1 n·DlogT) Yes / No Yes<br />

will inevitably introduce collusion among different queries [8].<br />

As shown in Fig. 1(b), the data user is able to search and<br />

decrypt the exact information on a multi-dimensional range<br />

query Q ′ = (age ∈ [20,30]) ∧ (salary ∈ [2,500,3,000])<br />

without the permission of the data owner.<br />

On the other hand, due to the leak of single dimensional<br />

query privacy in LSED + , the cloud server may easily deduce<br />

the content of a multi-dimensional range query with some<br />

advanced background information of outsourced data. For<br />

instance, if the cloud server can decompose the search token of<br />

multi-dimensional range query Q into each single dimension<br />

while some single-dimensional statistics (e.g., the percentage<br />

of people in their twenties within the whole population is<br />

20% or people’s salary between [1,000,2,000] is 40%) are<br />

available, then it is able to easily infer the content of this<br />

range query in each single dimension respectively based on<br />

the respective percentage of matched records in the database.<br />

Therefore, how to design an efficient and privacy-preserving<br />

MDRSE remains open.<br />

In this paper, we endeavor to solve this problem by proposing<br />

a privacy-preserving <strong>Multi</strong>-<strong>Dimensional</strong> <strong>Range</strong> <strong>Search</strong><br />

<strong>over</strong> Encrypted data (MDRSE) within sublinear time in the<br />

cloud. Obviously, with multi-dimensional data structures, such<br />

as kd-trees [10], we can easily achieve sublinear search time in<br />

the plaintext domain with existing methods [11]. However, the<br />

main challenge of utilizing these multi-dimensional data<br />

structures in this paper is how to exactly follow the search<br />

algorithm of a multi-dimensional data structure in the<br />

plaintext domain while all the nodes (including all the data<br />

records) in the structure and range queries are encrypted.<br />

Meanwhile, we also need to preserve single dimensional<br />

confidentiality and single dimensional query privacy. The<br />

main contributions of our paper are:<br />

(1) We first present two new predicate encryptions, named<br />

hyper-rectangle intersection predicate encryption and hyperrectangle<br />

contained predicate encryption respectively, by using<br />

some existing cryptographic primitive [8] in a novel way.<br />

With these predicate encryptions, we can verify the geometry<br />

relationships of two hyper-rectangles, including whether two<br />

hyper-rectangles intersect, and whether one hyper-rectangle is<br />

fully contained in another one.<br />

(2) With the properties of the two proposed predicate<br />

encryptions, we can build a sublinear time MDRSE, which<br />

can exactly follow the search algorithm of a kd-tree in the<br />

plaintext domain while all the nodes in the kd-tree and range<br />

queries are encrypted. More<strong>over</strong>, the security analyses show<br />

that our scheme is selectively secure and privacy-preserving.<br />

(3) With the similar design methodology, we can also<br />

construct another MDRSE based on range trees [12] to achieve<br />

a more efficient (polylogarithmic) search time <strong>over</strong> encrypted<br />

data. As a necessary tradeoff, it requires much more storage<br />

<strong>over</strong>head than the one based on kd-trees. In addition, it<br />

inevitably losses single dimensional query privacy, which is<br />

caused by the structure of a range tree itself. However, single<br />

dimensional confidentiality is still preserved.<br />

(4) Using a large-scale real-world dataset, we evaluate<br />

the performance of our schemes in experiments with up to<br />

millions of data records. Compared to previous solutions, our<br />

schemes show much higher scalability and efficiency <strong>over</strong><br />

large encrypted datasets.<br />

The comparison among different solutions is presented in<br />

Table I, where n denotes the number of data records in a<br />

database, D is the number of dimensions and T is the size<br />

of each dimension. Our schemes are within sublinear search<br />

time with respect to n.<br />

A. Framework for MDRSE<br />

II. PROBLEM DEFINITION<br />

In this paper, we focus on realizing sublinear time and<br />

privacy-preserving <strong>Multi</strong>-<strong>Dimensional</strong> <strong>Range</strong> <strong>Search</strong> <strong>over</strong> Encrypted<br />

Data (MDRSE). To achieve efficient (faster than linear)<br />

search time, a multi-dimensional data structure Γ should<br />

be used. Without loss of generality, we assume the entire range<br />

of the i-th dimension is [1,Ti], where Ti is a power of 2. We<br />

present the definition of MDRSE as below.<br />

Definition 1: (MDRSE). An MDRSE scheme includes the<br />

following six probabilistic polynomial time algorithms.<br />

• Setup(λ, {T1,...,TD}): Given security parameter λ and<br />

upper bound Ti in each dimension, output public key PK<br />

and master key MK.<br />

• Build({M1,...,Mn}, {A1,...,An}): Given all the data<br />

records {M1,...,Mn} and their attributes {A1,...,An},<br />

output data structure Γ.<br />

• Encrypt(PK, Mi, Ai, Γ): Given public key PK, each<br />

data record Mi, its attribute vector Ai = (ai,1,...,ai,D)<br />

and data structure Γ, output an encrypted data record Ci,<br />

and return outsourced data structure Γ ∗ based on all the<br />

encrypted data record {C1,...,Cn}.<br />

• Extract(MK, Q): Given master key MK and multidimensional<br />

range search query Q, output search token<br />

TKQ and decryption key SKQ.<br />

• <strong>Search</strong>(TKQ, Ci, Γ ∗ ): Given search token TKQ, encrypted<br />

data record Ci and outsourced data structure<br />

Γ ∗ , where Ci ←Encrypt(PK, Mi, Ai, Γ), output 1 if<br />

Ai ∈ Q; otherwise, output 0.<br />

• Decrypt(DKQ, Ci): Given decryption key SKQ and<br />

encrypted data record Ci, where Ci ←Encrypt(PK, Mi,<br />

Ai, Γ), output data record Mi if Ai ∈ Q; otherwise, ⊥.<br />

In the cloud scenario which we described in Fig. 2, there<br />

are three entities, including one or multiple data owners, data


Data Owner<br />

1. Request<br />

2. <strong>Search</strong> Token & Decryption Key<br />

Data User<br />

Encrypted Data<br />

3. <strong>Search</strong> Token<br />

4. Matched Encrypted Data<br />

Cloud Server<br />

Fig. 2. The system model includes one or multiple data owners, data users,<br />

and the cloud server.<br />

users and the cloud server. Compared to the case of a single<br />

data owner, an extra trusted party is needed to manage the<br />

master key for the case of multiple owners. Setup is run by<br />

the data owner (or a trusted party in the multi-owner case). The<br />

data owner builds a multi-dimensional data structure Γ with<br />

Build. Then, the owner encrypts every data record Mi based<br />

on its attribute vector Ai = (ai,1,...,ai,n) (an attributeai,j can<br />

be regraded as a value of a field/dimension in the database,<br />

such as ”age=30” or ”salary=2,000”) with public key PK using<br />

Encrypt, and outsources ciphertexts C = (C1,...,Cn) and<br />

outsourced data structure Γ ∗ to the cloud. This outsourced data<br />

structureΓ ∗ has the same structure asΓ, but all the nodes inΓ ∗<br />

are encrypted. The data owner (or a trusted party in the multiowner<br />

case) generates search token TKQ and decryption key<br />

SKQ for a data user based on a multi-dimensional range search<br />

query Q with Extract. In <strong>Search</strong>, a data user submits<br />

search token TKQ as the encrypted version of search query Q<br />

to the cloud server, and the cloud server follows outsourced<br />

data structure Γ ∗ and returns encrypted data record Ci to<br />

this data user, if 1 ←<strong>Search</strong>(TKQ, Ci, Γ ∗ ). After retrieving<br />

all matched records from the cloud server, the data user can<br />

decrypt each one using decryption key DKQ in Decrypt.<br />

In practice, to improve the efficiency of encryption <strong>over</strong><br />

large-size data records (e.g., e-health records), we can use an<br />

extra layer of symmetric key encryption to encrypt each data<br />

record first, and encrypt each symmetric key with MDRSE.<br />

B. Threat Model<br />

In this paper, we assume the data owner herself is a secure<br />

and trusted party. The cloud server is assumed as a semitrusted<br />

party (a.k.a. honest-but-curious party [5], [13]), which<br />

means the data owner believes the cloud server is able to<br />

provide reliable data storage and search services (will return<br />

the correct search results by following the protocols), but may<br />

be curious about not only the content of data outsourced by<br />

the owner but also the content of search queries submitted by<br />

data users. We also assume that data users are curious — a<br />

data user may try to search and decrypt more data than allowed<br />

with his given capability, or even collude with the cloud server<br />

to reveal more privacy information about the owner’s data.<br />

C. Security Definition and Design Objectives<br />

Similar to previous public-key based searchable encryption<br />

schemes, our MDRSE should achieve selective security [14].<br />

The security game between a challenger and an adversary with<br />

selective security can be described as follows.<br />

Definition 2: (MDRSE-selective security). An MDRSE<br />

scheme is selectively secure if all polynomial time adversaries<br />

have at most a negligible advantageǫin the following selective<br />

security game.<br />

• Setup: The challenger runs algorithmSetup(λ, {T1,...,<br />

TD}) to generate public key PK and master key MK. It<br />

keeps master key MK private and gives public key PK to<br />

the adversary.<br />

• Phase 1: The adversary adaptively issues p search token<br />

and decryption key requests to the challenger for<br />

multi-dimensional range queries Q1,...,Qp, and obtains<br />

p search tokens and corresponding p decryption keys.<br />

• Challenge: The adversary submits two equal length data<br />

records M∗ 0 and M∗ 1 , and their corresponding attribute<br />

vectors A∗ 0 and A∗ 1, where A∗ 0 /∈ Qi and A∗ 1 /∈ Qi,<br />

∀i ∈ [1,p]. The challenger flips a coin, the random<br />

result of the coin flipping is b ∈ {0,1}. Then, the<br />

challenger encrypts data record M∗ b by running algorithm<br />

Encrypt(PK, M∗ b , A∗ b , Γ), and returns encrypted data<br />

record C∗ b to the adversary.<br />

• Phase 2: The adversary continues to issue another q<br />

search token and decryption key requests to the challenger<br />

for multi-dimensional range queries Qp+1,...,Qp+q, and<br />

obtains q search tokens and corresponding q decryption<br />

keys, where A∗ 0 /∈ Qi and A∗ 1 /∈ Qi, ∀i ∈ [p+1,p+q].<br />

• Guess: Eventually, the adversary outputs a guess b ′ of b.<br />

An adversary A’s advantage in the above game is defines as<br />

Adv MDRSE−SS<br />

A (λ) = |Pr[b = b ′ ]− 1<br />

| ≤ ǫ. (1)<br />

2<br />

In the above security game, an adversary is allowed to obtain<br />

search tokenTKQ and decryption keySKQ for any search query<br />

Q of his choice, however, it is not able to distinguish encrypted<br />

data record C∗ 0 from encrypted data record C∗ 1 , where none of<br />

search query chosen by this adversary contains attribute vector<br />

A∗ 0 or A∗ 1.<br />

Besides the above security requirement, the main design<br />

objectives of our MDRSE focus on the following three aspects:<br />

• Sublinear <strong>Search</strong> Time. The search with a multidimensional<br />

range search query on the entire encrypted<br />

cloud data should be finished within sublinear time with<br />

respect to n, the total number of the data records.<br />

• Single <strong>Dimensional</strong> Confidentiality. The decryption<br />

ability of a data user on a multi-dimensional range query<br />

should not render him the additional capability to decrypt<br />

on each single dimension independently.<br />

• Single <strong>Dimensional</strong> Query <strong>Privacy</strong>. The search ability<br />

of a data user on a multi-dimensional range query should<br />

not grant him the additional capability to search on each<br />

single dimension independently.<br />

<strong>Preserving</strong> single dimensional confidentiality will resist decryption<br />

collusion among queries. Similarly, protecting single<br />

query privacy is able to prevent search collusion among<br />

queries. In this paper, we do not aim at providing the data<br />

access pattern privacy, which is an orthogonal problem [15].


L2<br />

L8<br />

L5<br />

P4 P9<br />

P5<br />

P1<br />

P2<br />

L4<br />

P3<br />

L1<br />

P7<br />

L9<br />

L7<br />

P10<br />

L6<br />

P6<br />

P8<br />

L3<br />

P1<br />

L8<br />

P2<br />

L1<br />

L2 L3<br />

L4 L5<br />

L6 L7<br />

P3 P4 P5<br />

L9<br />

P6 P7<br />

P8 P9 P10<br />

Nodes are not visited:<br />

L5 L8 P1 P2 P4 P5<br />

Nodes are visited:<br />

L1 L2 L3 L4 L6 L7 P3 P8 P9 P10<br />

Nodes are reported directly:<br />

L9 P6 P7<br />

Points are reported (points lie in the query):<br />

P3 P6 P7<br />

(a) Points {P1,...,P10} in two- (b) The kd-tree based on Points {P1,...,P10}, and the search on it based on a query rectangle R, where<br />

dimension, and a query rectangle R point P3, P6, and P7 are reported<br />

c<strong>over</strong>s point P3, P6, and P7<br />

III. PRELIMINARIES<br />

In this section, we introduce some multi-dimensional data<br />

structures and some definitions in the multi-dimensions.<br />

A. <strong>Multi</strong>-<strong>Dimensional</strong> Data Structures<br />

A kd-tree [10] is a type of multi-dimensional data structure,<br />

which can be used in many applications, such as range<br />

searches and nearest neighbor searches [11]. The basic idea<br />

to build a kd-tree is to split a set of (1-dimensional) points<br />

into two subsets of roughly equal size, one subset contains<br />

the points smaller than or equal to the splitting value, the<br />

other subset contains the points larger than the splitting value.<br />

The splitting value is stored at the root, and the two subsets<br />

are stored recursively in the two subtrees [11]. For the case<br />

of two-dimensional points, the splitting is first performed on<br />

X-dimension, next on Y-dimension, then on X-dimension, and<br />

so on. Further details about how to construct a kd-tree can be<br />

found in [11]. An example of a kd-tree in the two-dimension<br />

is illustrated in Fig. 3. According to the construction, a kd-tree<br />

for a set of n points use O(n) storage and can be constructed<br />

in O(nlogn) time.<br />

L2<br />

L8<br />

L5<br />

P4 P9<br />

P5<br />

P1<br />

P2<br />

L4<br />

P3<br />

L1<br />

P7<br />

L9<br />

L7<br />

P10<br />

L6<br />

P6<br />

P8<br />

L3<br />

Fig. 3. An example of on a kd-tree, and a search query on it.<br />

L9<br />

P6 P7<br />

Fig. 4. Non-leaf node L3 represents two rectangles (the yellow rectangle<br />

and the red rectangle).<br />

As we can see from Fig. 4, a leaf node in a kd-tree<br />

represents a point, and a non-leaf node (a splitting value)<br />

represents two rectangles. Generally, a two-dimensional range<br />

query on a kd-tree can also be described as a rectangle [11].<br />

We observe that whether two rectangles intersect and<br />

whether a rectangle is fully contained in another one are<br />

two significant factors to make decisions at non-leaf nodes<br />

during the search algorithm on a kd-tree. For the general<br />

case in the multi-dimension, we use hyper-rectangles instead<br />

of rectangles. A search example and details of the search<br />

algorithm are presented in Fig. 3 and Algorithm 1. With<br />

L1<br />

L6<br />

P8<br />

L3<br />

P9<br />

L7<br />

P10<br />

kd-trees, a D-dimensional range query takes O(n 1−1/D +k)<br />

(sublinear) time, where k is the number of reported points.<br />

Algorithm 1: <strong>Search</strong>KdTree(v,R)<br />

Input: The root (or a node) v of a kd-tree, and a rectangle R.<br />

Output: All points at leaves below v that lie in R.<br />

1 if v is a leaf then<br />

2 Report the point stored at v if it lies in R.<br />

3 else if v is a non-leaf then<br />

4 if LeftRectangle(v) is fully contained in R then<br />

5 Report all the points in LeftRectangle(v);<br />

6 else if LeftRectangle(v) intersects R then<br />

7 <strong>Search</strong>KdTree(LeftChild(v), R);<br />

8 if RightRectangle(v) is fully contained in R then<br />

9 Report all the points in RightRectangle(v);<br />

10 else if RightRectangle(v) intersects R then<br />

11 <strong>Search</strong>KdTree(RightChild(v), R);<br />

Compared to a kd-tree, a range tree [11] is able to achieve an<br />

even faster search time with O(log D n + k). As a necessary<br />

tradeoff, it requires O(nlog D−1 n) storage, which is much<br />

more than the O(n) storage in a kd-tree. Details description<br />

about range trees can be found in [11].<br />

B. Definitions in the <strong>Multi</strong>-Dimension<br />

Here we introduce some definitions, including lattices,<br />

points and hyper-rectangles.<br />

Definition 3: (Lattice). Let ∆ = (T1,...,TD), where Ti<br />

is the upper bound in the i-th dimension and 1 ≤ i ≤ D.<br />

A lattice L∆ is defined as L∆ = [T1] × ··· × [TD], where<br />

[Ti] = {1,...,Ti}.<br />

Definition 4: (Point). A point X in L∆ is defined as X =<br />

(x1,...,xD), where xi ∈ [Ti], ∀i ∈ D.<br />

Definition 5: (Hyper-Rectangle). A hyper-rectangle HR<br />

in L∆ is defined as HR = (R1,...,RD), where Ri is a range<br />

in the i-th dimension, Ri ∈ [1,Ti], ∀i ∈ D.<br />

IV. MULTI-DIMENSIONAL RANGE SEARCH OVER<br />

ENCRYPTED DATA (MDRSE)<br />

In this section, we will first show how to design a straightforward<br />

MDRSE with some existing cryptographic primitive<br />

[8] but without the help of any data structure. Based on<br />

the properties of the primitive, this straightforward solution


can protect both single dimensional confidentiality and query<br />

privacy, but only with linear search time, which is not practical<br />

for searching on large datasets in the cloud. Then, we will describe<br />

how to build a sublinear time and privacy-preserving<br />

MDRSE with kd-trees and the same cryptographic primitive<br />

[8], which however, is leveraged in a novel way.<br />

A. Straightforward Solution<br />

By directly leveraging a recent primitive, named <strong>Multi</strong>dimensional<br />

<strong>Range</strong> Predicate Encryption (MRPE) [8], we can<br />

first design a straightforward solution of MDRSE. Specifically,<br />

the data owner can directly encrypt each data record using<br />

MRPE without any data structure, and outsource all the<br />

encrypted data records to the cloud server. Then, the cloud<br />

server is able to preform search on all the encrypted records<br />

one by one based on the search token submitted by a data user.<br />

In fact, this is the final solution in [8]. Clearly, the search time<br />

of this straightforward solution is linearly increasing with the<br />

total number of encrypted data records in the cloud server.<br />

Considering the number of data records stored in the cloud<br />

server could be quite large, this straightforward solution is<br />

not practical, especially in the cloud scenario.<br />

The MRPE mentioned above is a special type of encryption<br />

that, if a message is encrypted with a point X, then the<br />

encrypted version of this message can be correctly decrypted<br />

with a decryption key generated by a hyper-rectangle HR,<br />

if X ∈ HR. Details of MRPE are presented in Fig. 5. In<br />

the straightforward solution, each data record Mi can be seen<br />

as a message, its attribute vector Ai = (a1,i,...,aD,i) can be<br />

described as a point X, and a multi-dimensional range search<br />

query Q can be represented as a hyper-rectangle HR.<br />

This MRPE [8] is built on Anonymous Identity-Based<br />

Encryption (AIBE) [16] and binding techniques, which are<br />

able to prevent collusion attacks introduced by the use of<br />

the combination of different secret keys [8]. The ability of<br />

preventing collusion attacks with different secret keys enables<br />

the straightforward solution to preserve single dimensional<br />

confidentiality and query privacy. MRPE also has a predicateonly<br />

version, denoted as MRPE O , where only the flag string<br />

Flag is encrypted, and the final output of decryption is 1,<br />

if X ∈ HR and 0 otherwise. MRPE is selectively secure<br />

if all polynomial time adversaries have at most a negligible<br />

advantage ǫ in the selective security game. The security<br />

of MRPE is based on the Decision Bilinear Diffie-Hellman<br />

Assumption and Decision Linear Assumption [8]. Details of<br />

security analyses can be found in [8]. Actually, MRPE can be<br />

seen as a type of Functional Encryption [17], which is believed<br />

as a new vision for public key cryptography.<br />

B. Our Solution: Challenges and Overview<br />

In order to improve the search efficiency (faster than linear<br />

search time), we would like to exploit a kd-tree to organize<br />

all the data records and handle multi-dimensional range search<br />

queries by following the search algorithm of a kd-tree. Obviously,<br />

it is easy to implement with existing techniques if the<br />

data records are in plaintext. However, it is a challenging task<br />

to exactly follow the search algorithm of a kd-tree when all<br />

the nodes in the kd-tree and range queries are encrypted. More<br />

Definition 6: (MRPE). A <strong>Multi</strong>-<strong>Dimensional</strong> <strong>Range</strong> Predicate<br />

Encryption (MRPE) scheme should consist of the following four<br />

probabilistic polynomial time algorithms.<br />

Setup (λ, L∆): Given security parameter λ and lattice L∆ =<br />

[T1]×···×[TD], where [Ti] = {1,...,Ti}, output public key<br />

PK and master key MK.<br />

Encrypt (PK, X, M, Flag): Given public key PK, point X ∈<br />

L∆, message M and flag string Flag ∈ {0,1} α , where Flag<br />

is a constant string with α bits, output ciphertext C, where C<br />

is the encryption of Flag||M, || denotes the concatenation of<br />

the flag string and a message.<br />

DeriveKey (MK, HR): Given master key MK and hyperrectangle<br />

HR, where HR ∈ L∆, output secret key SKHR.<br />

Decrypt (PK, SKHR, C): Given public key PK, secret key<br />

SKHR and ciphertext C, output M if the first α bits of<br />

the decryption of C is Flag (X ∈ HR), and ⊥ otherwise<br />

(X /∈ HR). The above algorithms must satisfy the following<br />

constraints:<br />

Decrypt(PK,SKHR,C) =<br />

Flag||M, if X ∈ HR<br />

⊥, w.h.p., if X /∈ HR<br />

where C =Encrypt(PK,X,M,Flag)<br />

SKHR =DeriveKey(MK,HR)<br />

If X /∈ HR, the probability that the decryption result shows ⊥<br />

is 1−2 −α .<br />

Fig. 5. Details of MRPE.<br />

specifically, according to what we observed in Section III, we<br />

need to find methods to decide whether two hyper-rectangles<br />

intersect; and whether one hyper-rectangle is fully contained<br />

in another one based only on encrypted data.<br />

To the best of our knowledge, there is no Functional<br />

Encryption (also denoted as Predicate Encryption) from the<br />

current literature is able to be directly used to verify the<br />

two geometry relationships of two hyper-rectangles that we<br />

mentioned above while still preserving privacy in each single<br />

dimension. Interestingly, by utilizing MRPE [8] in a novel<br />

way, we can find solutions to decide these two geometry<br />

relationships of two hyper-rectangles on encrypted data. More<br />

importantly, we can also protect privacy and confidentiality in<br />

each single dimension.<br />

In fact, by doing so, we can actually build two new Predicate<br />

Encryptions based on MRPE. The first one is hyper-rectangle<br />

intersection predicate encryption, where if a message is<br />

encrypted with a hyper-rectangle HR and it can only be<br />

decrypted with a decryption key generated by another hyper-<br />

rectangle HR ′ , if HR ∩ HR ′<br />

= ∅. The second one is<br />

hyper-rectangle contained predicate encryption, where if<br />

a message is encrypted with a hyper-rectangle HR and it can<br />

only be decrypted with a decryption key generated by another<br />

hyper-rectangle HR ′ , if HR ⊆ HR ′ .<br />

C. Hyper-Rectangle Intersection Predicate Encryption<br />

We now explain how to build a hyper-rectangle intersection<br />

predicate encryption based on MRPE. For the ease of<br />

description, we start with two ranges in a single dimension.<br />

Given two ranges R and R ′ in a single dimension, we treat<br />

the first range R = [xl,xu] as a point in a two-dimension, and


transform the second range R ′ = [x ′ l ,x′ u] into a rectangle in a<br />

two-dimension, and then decide whether the two-dimensional<br />

point based on the first range lies in the two-dimensional<br />

rectangle generated from the second range. If the point is<br />

indeed in the rectangle, then the two ranges intersect with<br />

each other. Otherwise, they do not intersect. Examples of range<br />

intersection are described in Fig. 6. Based on this basic idea,<br />

we have the following lemma<br />

x1,l x1,u<br />

1<br />

x2,l x2,u x3,l x3,u<br />

x ′ l<br />

x ′ u<br />

x4,l x4,u<br />

<strong>Range</strong>s intersect with R ′ = [x ′ l ,x′ u]<br />

<strong>Range</strong>s do not intersect with R ′ = [x ′ l ,x′ u]<br />

Fig. 6. Examples of whether two ranges intersect.<br />

Lemma 1: For two ranges R = [xl,xu] and R ′ = [x ′ l ,x′ u],<br />

we have a point X = (x1,x2) = (xl,xu) based on range R<br />

and a rectangle HR ′ = (R ′ 1,R ′ 2) based on range R ′ , where<br />

R ′ 1 = [1,x ′ u] and R ′ 2 = [x ′ l ,T]. If X ∈ HR′ , then R∩R ′ =<br />

∅; otherwise, X /∈ HR ′ , then R∩R ′ = ∅.<br />

The correctness of this lemma can be explained as below<br />

R∩R ′ = ∅ ⇔<br />

xl ≤ x ′ u<br />

xu ≥ x ′ l<br />

⇔<br />

xl ∈ [1,x ′ u]<br />

xu ∈ [x ′ l ,T] ⇔ X ∈ HR′ .<br />

By extending Lemma 1, we can decide whether the intersection<br />

of two hyper-rectangles is null.<br />

Lemma 2: For two hyper-rectangles HR = (R1,...,RD)<br />

and HR ′ = (R ′ 1,...,R ′ D ), where Ri = [xi,l,xi,u] and R ′ i =<br />

[x ′ i,l ,x′ i,u ], for i ∈ [D], we have a point X = (x1,...,x2D) =<br />

(x1,l,x1,u,...,xD,l,xD,u) and a hyper-rectangle HR ′′ =<br />

(R ′′<br />

1,...,R ′′<br />

2D ), where R′′ 2i−1 = [1,x′ i,u ] and R′′ 2i = [x′ i,l ,T],<br />

for i ∈ [D]. If X ∈ HR ′′ , then HR∩HR ′ = ∅; otherwise,<br />

X /∈ HR ′′ , then HR∩HR ′ = ∅.<br />

Then, we can encrypt a message with this 2D-dimensional<br />

point (transformed from HR) using MRPE, and this message<br />

can only be decrypted with a decryption key generated by a<br />

2D-dimensional hyper-rectangle (transformed from HR ′ ), if<br />

HR ∩ HR ′ = ∅. Clearly, this hyper-rectangle intersection<br />

predicate encryption is selectively secure if MRPE is selectively<br />

secure.<br />

D. Hyper-Rectangle Contained Predicate Encryption<br />

Similar as the method in the last subsection, we can decide<br />

whether a hyper-rectangle is fully contained in another one.<br />

For the ease of understanding, we still start with two ranges<br />

in the single dimension. Given two ranges R and R ′ , we<br />

transform the first range as a two-dimensional point, and<br />

transform the second one as a two-dimensional rectangle. If<br />

the point is in the rectangle, then range R is fully contained<br />

in range R ′ . Compared to the method of deciding whether<br />

two ranges intersect, the second range R ′ in this method is<br />

transformed in a different way, which is illustrated in the<br />

following lemma and Fig. 7.<br />

T<br />

x1,l<br />

x ′ l<br />

x1,u<br />

x2,lx2,u x3,lx3,u<br />

1 T<br />

x ′ u<br />

x4,l x4,u<br />

<strong>Range</strong>s are fully contained in R ′ = [x ′ l ,x′ u]<br />

<strong>Range</strong>s are not fully contained in R ′ = [x ′ l ,x′ u]<br />

Fig. 7. Examples of whether a range is fully contained in another one.<br />

Lemma 3: For two ranges R = [xl,xu] and R ′ = [x ′ l ,x′ u],<br />

we have a point X = (x1,x2) = (xl,xu) based on range R<br />

and a rectangle HR ′ = (R ′ 1,R ′ 2) based on range R ′ , where<br />

R ′ 1 = [x ′ l ,x′ u] and R ′ 2 = [x ′ l ,x′ u]. If X ∈ HR ′ , then R ⊆ R ′ ;<br />

otherwise, X /∈ HR ′ , then R R ′ .<br />

The correctness of this lemma can be proved as below<br />

R ⊆ R ′ ⇔<br />

xl ∈ [x ′ l ,x′ u]<br />

xu ∈ [x ′ l ,x′ u] ⇔ X ∈ HR′ .<br />

Similarly, we can also extend the above lemma into the multidimensional<br />

version.<br />

Lemma 4: For two hyper-rectangles HR = (R1,...,RD)<br />

and HR ′ = (R ′ 1,...,R ′ D ), where Ri = [xi,l,xi,u] and R ′ i =<br />

[x ′ i,l ,x′ i,u ], for i ∈ [D], we have a point X = (x1,...,x2D) =<br />

(x1,l,x1,u,...,xD,l,xD,u) and a hyper-rectangle HR ′′ =<br />

(R ′′<br />

1,...,R ′′<br />

2D ), where R′′ 2i−1 = [x′ i,l ,x′ i,u<br />

] and R′′<br />

2i =<br />

[x ′ i,l ,x′ i,u ], for i ∈ [D]. If X ∈ HR′′ , then HR ⊆ HR ′ ;<br />

otherwise, X /∈ HR ′′ , then HR HR ′ .<br />

Then, we can encrypt a message with this 2D-dimensional<br />

point (transformed from HR) using MRPE, and this message<br />

can only be decrypted with a decryption key generated by<br />

a 2D-dimensional hyper-rectangle (transformed from HR ′ ),<br />

if HR ⊆ HR ′ . Similarly, this hyper-rectangle contained<br />

predicate encryption is selectively secure as long as MRPE<br />

is selectively secure.<br />

E. Construction of Our Scheme<br />

With the two novel predicate encryptions above, we can<br />

build a sublinear time and privacy-preserving MDRSE by<br />

following the framework in Section II. Details of our scheme<br />

based on kd-trees are illustrated in Fig. 8.<br />

Specifically, the data owner first generates the public key<br />

and master key in Setup, and builds a kd-tree Γ based<br />

all the data records and attribute vectors in Build. Then,<br />

with Encrypt, the data owner encrypts every node in Γ,<br />

and outputs outsourced kd-tree Γ ∗ , which has exactly the<br />

same structure as Γ. A non-leaf node is encrypted with<br />

the combination of hyper-rectangle contained predicate encryption<br />

(e.g., {C3,L,C3,R}) and hyper-rectangle intersection<br />

predicate encryption (e.g., {C4,L,C4,R}), and returned as<br />

C = {C3,L,C3,R,C4,L,C4,R}. Note that only predicate-only<br />

version of the two encryptions are required for a non-leaf node,<br />

because a non-leaf node does not store any data records. A<br />

leaf node is encrypted with the combination of symmetric key<br />

encryption, MRPE and its predicate-only version, and returned<br />

as Ci = {Ci,0,Ci,1,Ci,2}.


A data user is given the search token and decryption key<br />

from the data owner based on the master key and search<br />

query inExtract, and submits the search token to the cloud<br />

server to operate <strong>Search</strong>. At a non-leaf node, if hL = 1<br />

(the left hyper-rectangle is fully contained in the query hyperrectangle),<br />

then the cloud server reports all the nodes in the left<br />

subtree; else if gL = 1 (the two hyper-rectangle intersect), the<br />

cloud server continues to search in the left subtree; otherwise,<br />

the cloud server stops to search the left subtree; Meanwhile,<br />

the cloud server also computes hR and gR, and make decisions<br />

for the right subtree with the same rule as described in Fig. 8.<br />

At a leaf node, if f = 1 (a point lies in the query hyperrectangle),<br />

the cloud server returns the encrypted data record<br />

on this leaf node to the data user.<br />

Compared to the search algorithm (described in Section III)<br />

of a kd-tree in the plaintext domain, our search algorithm <strong>over</strong><br />

encrypted data exactly follows the same search rules not only<br />

at non-leaf nodes but also at leaf nodes, which enables our<br />

scheme to achieve sublinear search time. Finally, the data user<br />

is able to reveal a data record Mi with a two-layer decryption<br />

if its attribute vector Ai ∈ Q in Decrypt. According to the<br />

correctness of decryption, even a data user colludes with the<br />

cloud server, and obtains more encrypted data records than he<br />

is expected, he cannot reveal those data record, whose Ai /∈ Q.<br />

F. Security and <strong>Privacy</strong> Analysis<br />

We now analyze the security and privacy of our scheme.<br />

Theorem 1: (Our Scheme-selective security). Our proposed<br />

scheme is selectively secure, as long as MRPE is<br />

selectively secure.<br />

Proof: The security of our scheme includes two aspects:<br />

the security on non-leaf nodes and security on leaf nodes.<br />

In our scheme, each non-leaf node can be represented<br />

as two hyper-rectangles, and the two hyperrectangle<br />

are encrypted with MRPEO separately. Without<br />

loss of generality, for two different encrypted values<br />

C∗ 1 = {C∗ 1,3,L ,C∗ 1,3,R ,C∗ 1,4,L ,C∗ 1,4,R } and C∗ 2 =<br />

{C∗ 2,3,L ,C∗ 2,3,R ,C∗ 2,4,L ,C∗ 2,4,R } on two different non-leaf<br />

nodes returned from the selective security game defined in<br />

Section II, if the advantage of an adversaryAon distinguishing<br />

two different ciphertexts C∗ 1,3,L and C∗ 2,3,L (or C∗ 1,3,R and<br />

C∗ 2,3,R ) in the MRPEO selectively secure game is ǫ, then<br />

the advantage of this adversary A on distinguishing the two<br />

different encrypted values C∗ 1 and C∗ 2 on two different nonleaf<br />

nodes is4ǫ−6ǫ2 +4ǫ3−ǫ4 .The advantage of this adversary<br />

A on non-leaf nodes in our scheme’s selective security game<br />

is defined as<br />

Adv OurScheme−SS<br />

A−Non−Leaf (λ) = |Pr[b = b′ ]− 1<br />

2 |<br />

≤ 4ǫ−6ǫ 2 +4ǫ 3 −ǫ 4 .<br />

which is negligible.<br />

Each leaf node is encrypted with one round MRPE and<br />

one round MRPE O . Without loss of generality, for two different<br />

encrypted values C ∗ 1 = {C ∗ 1,0,C ∗ 1,1,,C ∗ 1,2} and C ∗ 2 =<br />

{C ∗ 2,0,C ∗ 2,1,C ∗ 2,2} on two different leaf nodes (data records)<br />

returned from the selective security game defined in Section<br />

II, if the advantage of an adversary A on distinguishing<br />

two different ciphertexts C ∗ 1,1 and C ∗ 2,1 in the MRPE selectively<br />

secure game is ǫ, and the advantage on distinguishing<br />

two different ciphertexts C ∗ 1,2 and C ∗ 2,2 in the corresponding<br />

MRPE O selectively secure game is ǫ, then the advantage of<br />

this adversary A on distinguishing two different encrypted<br />

values C ∗ 1 and C ∗ 2 on two different leaf nodes (data records)<br />

is 2ǫ−ǫ 2 .<br />

The advantage of this adversary A on leaf nodes in our<br />

scheme’s selective security game is defined as<br />

Adv OurScheme−SS<br />

A−Leaf (λ) = |Pr[b = b ′ ]− 1<br />

2 | ≤ 2ǫ−ǫ2 . (2)<br />

which is negligible.<br />

Note that since the length of ciphertexts on a non-leaf<br />

node and a leaf node are different, an adversary is able<br />

to distinguish a non-leaf node from a leaf node based on<br />

ciphertexts. However, distinguishing whether a node is a leaf<br />

node or non-leaf node, the adversary will not gain more<br />

more advantage in the selective security game. Therefore, our<br />

proposed scheme is selectively secure, as long as MRPE is<br />

selectively secure.<br />

Theorem 2: (Single <strong>Dimensional</strong> Confidentiality). If our<br />

scheme is selectively secure, then it preserves single dimensional<br />

confidentiality.<br />

Proof: We first assume an adversary is able to decrypt<br />

data records on a single-dimensional range search query independently<br />

based on a decryption key on a multi-dimensional<br />

range search query, which means our scheme would fail to<br />

achieve multi-dimensional confidentiality. Then we will show<br />

that this assumption would conflict the fact that our scheme<br />

is selectively secure.<br />

More specifically, given a multi-dimensional range search<br />

query Q = [x1,l,x1,u] ∧ [x2,l,x2,u] ∧ ···∧[xD,l,xD,u], the<br />

data owner generates a decryption key SK SKQ on this query Q<br />

for the adversary. This search query Q can be represented as a<br />

hyper-rectangle HRQ = {R1,R2,...,RD}, Ri = [xi,l,xi,u],<br />

for 1 ≤ i ≤ D.<br />

We first assume the adversary is able to decrypt with SK SKQ<br />

on a single-dimensional range search query Q ′ = [x1,l,x1,u],<br />

which can be represented as HRQ ′ = {R1,R ′ 2,...,R ′ D },<br />

R1 = [x1,l,x1,u] and R ′ i = [1,Ti], for 2 ≤ i ≤ D.<br />

Then, for two ciphertexts C ∗ 1 and C ∗ 2 , where the two points<br />

X ∗ 1,X ∗ 2 ∈ HRQ ′ but X∗ 1,X ∗ 2 /∈ HRQ, this adversary is<br />

able to decrypt C ∗ 1 and C ∗ 2 , which means it can distinguish<br />

the corresponding messages M ∗ 1 and M ∗ 2 . It also indicates<br />

that, for the two points X ∗ 1,X ∗ 2 /∈ HRQ, this adversary is<br />

able to distinguish M∗ 1 and M∗ 2 based on the decryption key<br />

SK SKQ, which means MRPE is not selectively secure. Clearly, it<br />

would conflict the fact that our scheme is selectively secure.<br />

Therefore, as long as our scheme is selectively secure, then our<br />

scheme is able to provide multi-dimensional confidentiality.<br />

Theorem 3: (Single <strong>Dimensional</strong> Query <strong>Privacy</strong>). If all<br />

the data records in our scheme are indexed with a kd-tree<br />

and our scheme is selectively secure, then it achieves single<br />

dimensional query privacy.


Setup(λ, {T1,...,TD}): Given security parameter λ and upper<br />

bound Ti in each dimension, build a lattice L∆ = [T1]×···×[TD]<br />

and a lattice L2∆ = [T1]×[T1]×···×[TD]×[TD]; compute<br />

(PK1,MK1) ←MRPE.Setup(λ,L∆),<br />

(PK2,MK2) ←MRPE O .Setup(λ,L∆),<br />

(PK3,MK3) ←MRPE O .Setup(λ,L2∆),<br />

(PK4,MK4) ←MRPE O .Setup(λ,L2∆);<br />

and output public key PK = {PK1,PK2,PK3,PK4} and master key<br />

MK = {MK1,MK2,MK3,MK4}.<br />

Build({M1,...,Mn}, {A1,...,An}): Given all the data records<br />

{M1,...,Mn} and their attributes {A1,...,An}, output kd-tree Γ.<br />

Encrypt(PK PK PK, {M1,...,Mn}, Γ): Given public key PK =<br />

{PK1,PK2,PK3,PK4}, all the data records {M1,...,Mn} and kdtree<br />

Γ, encrypt each node in kd-tree Γ.<br />

• If it is a non-leaf node, let HRL and HRR be the two hyperrectangles<br />

of this node; transform HRL = (R1,..,RD),<br />

where Ri = [xi,l,xi,u], into a point XL = (x1,...,x2D) =<br />

(x1,l,x1,u,...,xD,l,xD,u), transform HRR into a point XR<br />

in the same way; compute<br />

C3,L ←MRPE O .Encrypt(PK3,XL,Flag),<br />

C3,R ←MRPE O .Encrypt(PK3,XR,Flag),<br />

C4,L ←MRPE O .Encrypt(PK4,XL,Flag),<br />

C4,R ←MRPE O .Encrypt(PK4,XR,Flag),<br />

and output C = {C3,L,C3,R,C4,L,C4,R}.<br />

• If it is a leaf node, it is a data record Mi, i ∈ [1,n], and<br />

its attribute vector is Ai = (ai,1,...,ai,D); set a point X<br />

as X = (x1,...,xD) = (ai,1,...,ai,D), generate a random<br />

symmetric key Ki, then encrypt as<br />

Ci,0 ←Enc(Ki,Mi),<br />

Ci,1 ←MRPE.Encrypt(PK1,X,Ki,Flag),<br />

Ci,2 ←MRPE O .Encrypt(PK2,X,Flag),<br />

and output Ci = {Ci,0,Ci,1,Ci,2}.<br />

Output outsourced kd-tree Γ ∗ after encrypting all the nodes.<br />

Extract(MK MK MK,Q): Given master key MK = {MK1,MK2,MK3,MK4}<br />

and a multi-dimensional range search query Q = [x1,l,x1,u] ∧<br />

Proof: <strong>Preserving</strong> single dimensional query privacy in<br />

an MDRSE includes two aspects. First, the leveraged multidimensional<br />

data structure itself will not reveal additional<br />

query privacy in single-dimensions. Second, a data user (or<br />

the cloud server) is not able to decompose the search token of<br />

a multi-dimensional range query into the ones in each single<br />

dimension.<br />

For the first aspect, because all the data records in our<br />

scheme are indexed based on the structure of a kd-tree, where<br />

the search decision at each node depends on the ranges in all<br />

the dimensions, therefore, an adversary is not able to reveal<br />

single dimensional query privacy from the data structure of a<br />

kd-tree.<br />

Meanwhile, we can also prove that if an adversary is able<br />

to decompose the search token of a multi-dimensional range<br />

query into each single dimension, then it will conflict the fact<br />

that our scheme is selectively secure.<br />

More specifically, given a multi-dimensional range search<br />

query Q = [x1,l,x1,u] ∧ [x2,l,x2,u] ∧ ···∧[xD,l,xD,u], the<br />

Fig. 8. Details of our scheme based on kd-trees.<br />

··· ∧ [xD,l,xD,u], transform Q into a hyper-rectangle HR =<br />

(R1,...,RD), where Ri = [xi,l,xi,u] and 1 ≤ i ≤ D; generate<br />

a hyper-rectangle HR ′ = (R ′ 1,...,R ′ 2D), where R ′ 2i−1 =<br />

[xi,l,xi,u] and R ′ 2i = [xi,l,xi,u], for 1 ≤ i ≤ D, and a hyperrectangle<br />

HR ′′ = (R ′′<br />

1,...,R ′′<br />

2D), where R ′′<br />

2i−1 = [1,xi,u] and<br />

R ′′<br />

2i = [xi,l,Ti], for 1 ≤ i ≤ D; compute<br />

SK1 ←MRPE.DeriveKey(MK1,HR),<br />

TK2 ←MRPE O .DeriveKey(MK2,HR),<br />

TK3 ←MRPE O .DeriveKey(MK3,HR ′ ),<br />

TK4 ←MRPE O .DeriveKey(MK4,HR ′′ );<br />

and return search token TK TKQ = {TK2,TK3,TK4} and decryption<br />

key SK SKQ = SK1.<br />

<strong>Search</strong>(TK TK TKQ, Γ ∗ ): Given search token TK TKQ = {TK2,TK3,TK4}<br />

and outsourced kd-tree Γ ∗ , search from the root node of Γ ∗ and<br />

perform as follows:<br />

• If it is a non-leaf node C = {C3,L,C3,R,C4,L,C4,R}, compute<br />

hL ← MRPE O .Decrypt(PK3,TK3,C3,L), if hL = 1,<br />

report all the nodes in the left subtree; otherwise compute<br />

gL ← MRPE O .Decrypt(PK4,TK4,C4,L), if gL = 1, continue<br />

to search the left subtree; otherwise stop to search the<br />

left subtree;<br />

compute hR ←MRPE O .Decrypt(PK3,TK3,C3,R), if hR =<br />

1, report all the nodes in the right subtree; otherwise compute<br />

gR ← MRPE O .Decrypt(PK4,TK4,C4,R), if gR = 1,<br />

continue to search the right subtree; otherwise stop to search<br />

the right subtree;<br />

• If it is a leaf node Ci = {Ci,0,Ci,1,Ci,2}, compute<br />

f ← MRPE O .Decrypt(PK2,TK2,Ci,2), if f = 1, return<br />

Ci; otherwise, do not return Ci.<br />

Decrypt(SK SK SKQ,Ci): Given decryption keySK SK SKQ and encrypted data<br />

record Ci = {Ci,0,Ci,1,Ci,2}, compute<br />

Ki ←MRPE.Decrypt(PK1,SKHR,Ci,1),<br />

Mi ← Dec(Ki,Ci,0) = Dec(Ki,Enc(Ki,Mi)),<br />

and output data record Mi, if attribute set Ai ∈ Q; and ⊥<br />

otherwise.<br />

data owner generates a search token TK TKQ on this query Q for<br />

the adversary. This search query Q can be represented as a<br />

hyper-rectangle HRQ = {R1,R2,...,RD}, Ri = [xi,l,xi,u],<br />

for 1 ≤ i ≤ D.<br />

We first assume the adversary is able to search with TK TKQ<br />

on a single-dimensional range search query Q ′ = [x1,l,x1,u],<br />

which can be represented as HRQ ′ = {R1,R ′ 2,...,R ′ D },<br />

R1 = [x1,l,x1,u] and R ′ i = [1,Ti], for 2 ≤ i ≤ D.<br />

Then, for two ciphertexts C ∗ 1 and C ∗ 2 , where the two points<br />

X ∗ 1,X ∗ 2 ∈ HRQ ′ but X∗ 1,X ∗ 2 /∈ HRQ, this adversary is<br />

able to decrypt C ∗ 1 and C ∗ 2 , which means it can distinguish<br />

the corresponding messages M ∗ 1 and M ∗ 2 . It also indicates<br />

that, for the two points X ∗ 1,X ∗ 2 /∈ HRQ, this adversary is<br />

able to distinguish M∗ 1 and M∗ 2 based on the search token<br />

TK TKQ, which means MRPE is not selectively secure. Clearly, it<br />

would conflict the fact that our scheme is selectively secure.<br />

Therefore, as long as our scheme is selectively secure, an<br />

adversary is not able to decompose the search token of a multidimensional<br />

range query into each single dimension.


V. MULTI-DIMENSIONAL RANGE SEARCH OVER<br />

ENCRYPTED DATA (MDRSE) ON RANGE TREES<br />

A. Overview<br />

Generally, different multi-dimensional data structures in the<br />

plaintext domain can be used to achieve different tradeoffs<br />

between search time and storage <strong>over</strong>head. Ideally, we would<br />

like to have the capability to make the same tradeoff <strong>over</strong><br />

encrypted data. In fact, we can also use the similar methodology<br />

in the last section to design another MDRSE with an<br />

even faster (polylogarithmic) search time based on range trees<br />

[12]. According to the tradeoff, which we shown in Section<br />

III, the scheme based on a range tree requires much more<br />

storage <strong>over</strong>head than the one based on a kd-tree.<br />

Specifically, we can exactly follow the search algorithm of<br />

a range tree in plaintext when all the nodes in a range tree are<br />

encrypted with the utilization of hyper-rectangle intersection<br />

predicate encryption and another novel predicate encryption,<br />

named hyper-rectangle containing predicate encryption.<br />

With the hyper-rectangle containing predicate encryption, if<br />

a message is encrypted with a hyper-rectangle HR, then it<br />

can be only decrypted with a decryption key generated by a<br />

hyper-rectangle HR ′ , if HR ⊇ HR ′ . Note that this hyperrectangle<br />

containing predicate encryption is different from the<br />

hyper-rectangle contained predicate encryption designed in the<br />

previous section.<br />

However, because the search algorithm of a range tree<br />

itself can be operated on each single dimension independently,<br />

this scheme based on range trees is not able to achieve<br />

single dimensional query privacy. For example, given a multidimensional<br />

range query Q = [x1,x2] ∧ [y1,y2], the search<br />

results of it after X-dimension is exactly the same as the search<br />

result of the single dimensional range query Qx = [x1,x2], no<br />

matter all the nodes in the range tree are encrypted or not. Note<br />

that single dimensional confidentiality in the range tree-based<br />

scheme is still preserved.<br />

Actually, we also believe our design methodology can be<br />

leveraged into other multi-dimensional data structures, such as<br />

R-trees [18] and Quadtrees [19], whose search algorithm also<br />

depends on the geometry relationships of hyper-rectangles.<br />

B. Introduction to <strong>Range</strong> Trees<br />

Before describing our MDRSE based on range trees, we first<br />

briefly introduce how to search on a range tree. Without loss of<br />

generality, we present the following examples and algorithms<br />

with the two-dimension, which can be easily extended into the<br />

multi-dimensional case. An example of a range tree is listed<br />

in Fig. 9. In a 2-dimensional range tree, all the nodes are<br />

first organized in x-dimension as a balanced binary tree. Then<br />

for each non-leaf node, it also has a second level tree (still<br />

a balanced binary tree), where the nodes under this non-leaf<br />

node are organized based on y-dimension.<br />

The essential idea to search on a range tree is to search<br />

each single dimension one by one. Specifically, given range<br />

query Q = [x1,x2]∧[y1,y2], we first concentrate on finding<br />

the points whose x-dimension lies in [x1,x2], and then we<br />

report those points whose y-dimension also lies in [y1,y2]. As<br />

a result, for the search on each single dimension (x-dimension<br />

or y-dimension), it is a 1-dimensional range query, which is<br />

described as follows.<br />

Given range query Qx = [x1,x2], we search with [x1,x2] in<br />

the x-dimension until we get to a node vsplit where the search<br />

paths split. From the left child of vsplit, we continue the search<br />

with x1, and at every node v where the search path of x goes<br />

left, we report all points in the right subtree of v. Similarly,<br />

we continue the search with x2 at the right child of vsplit, and<br />

at every node v where the search path of x ′ goes right, we<br />

report all points in the left subtree of v. Finally, we check the<br />

leaves µ and µ ′ where the two paths end to see if they contain<br />

a point in the range. The search process is also illustrated in<br />

Fig.9(c).<br />

Based on the range search algorithm on 1-dimension, the<br />

search algorithm of a 2-dimensional range tree is presented in<br />

Algorithm 2. Further details about range trees can be found<br />

in [11].<br />

Algorithm 2: 2D<strong>Range</strong>Query(v, Q)<br />

Input: A 2D range tree Γ and a query Q = [x1,x2]∧[y1,y2].<br />

Output: All points in Γ that lie in Q.<br />

1 vsplit ← FindSplitNode(Γ,[x1,x2]);<br />

2 if vsplit is a leaf then<br />

3 Report it if the point stored at vsplit lies in Q.<br />

4 else<br />

5 v ← LeftChild(vsplit);<br />

6 while v is not a leaf do<br />

7 if x1 ≤ xv then<br />

8 1D<strong>Range</strong>Query(Γassoc(RightChild(v)),[y1,y2]);<br />

9 v ← LeftChild(v);<br />

10 else<br />

11 v ← RightChild(v);<br />

12 Report the leaf (where the path ends) if the point stored at<br />

vsplit lies in Q;<br />

13 Similarly, follow the path from RightChild(vsplit) to x2,<br />

call 1D<strong>Range</strong>Query with the range [y1,y2] on the<br />

associate structures of subtrees Γassoc(LeftChild(v)), which<br />

is on the left of the path, and report a point at the leaf<br />

(where the path ends) if it lies in Q.<br />

C. Challenges<br />

As we can see from Algorithm 2, by doing comparison (i.e.<br />

if x1 ≤ xv) in 1-dimensional range query, it is more easier<br />

and simpler to make decisions at nodes in a range tree than<br />

verifying geometry relationships of two hyper-rectangles at<br />

nodes in a kd-tree. However, during the design of an MDRSE<br />

based on range trees, if we directly follow the one-by-one<br />

search on each single dimension of a range tree in plaintext<br />

domain, we have to decompose search tokens and decryption<br />

tokens <strong>over</strong> encrypted data into each single dimension, which<br />

will no doubt reveal more information to data users than<br />

they are authorized.<br />

In order to leverage the similar methodology in the last<br />

section to solve the above problem, we first transform the<br />

decision conditions at nodes in a range tree into geometry<br />

relationships of two rectangles in the two-dimension (two


X-dimension<br />

(first level tree)<br />

Y-dimension<br />

(second level tree)<br />

(a) 2D range tree, where each non-leaf node in xdimension<br />

has a second level tree on y-dimension.<br />

X-dimension<br />

(first level tree)<br />

Two Rectangles<br />

HRL = ([1,3],[1,Ty])<br />

HRR = ([3,6],[1,Ty])<br />

1<br />

3<br />

1,5 3,8 4,2<br />

hyper-rectangles in the multi-dimension). This transformation<br />

makes the decision conditions more complicated than the<br />

original ones, but will offer us an opportunity to design an<br />

MDRSE based on range trees.<br />

Specifically, similar as a non-leaf node in a kd-tree, a nonleaf<br />

node in a range tree can also be represented with two<br />

rectangles on each level tree (shown in Fig. 9(b)) and the range<br />

query can be always denoted as a rectangle HRQ. Then, we<br />

can find a method to decide the split node in a range tree by<br />

verifying whether a rectangle fully c<strong>over</strong>s another one. Details<br />

are illustrated in the following algorithm. To distinguish the<br />

original one, we denote it as FindSplitNode ∗ . Note that the<br />

input of the original one is the range treeΓand a range[x1,x2]<br />

in a single dimension, while the input of ours is the range tree<br />

Γ and the entire range query Q in the multi-dimension.<br />

Algorithm 3: FindSplitNode ∗ (Γ, Q)<br />

Input: A 2D range tree (or its subtree) Γ, and a query<br />

Q = [x1,x2]∧[y1,y2], which can be transformed as a<br />

rectangle HRQ = [Rx,Ry], where Rx = [x1,x2] and<br />

Ry = [y1,y2];<br />

Output: The split node vsplit in a single dimension.<br />

1 v is the root (or a node) of Γ;<br />

2 if v is a leaf then<br />

3 v is the split node.<br />

4 else if v is a non-leaf then<br />

5 if LeftRectangle(v) fully c<strong>over</strong>s HRQ then<br />

6 FindSplitNode ∗ (LeftSubtree(v),Q);<br />

7 else if RightRectangle(v) fully c<strong>over</strong>s HRQ then<br />

8 FindSplitNode ∗ (RightSubtree(v),Q);<br />

9 else<br />

10 v is the split node;<br />

With Algorithm FindSplitNode ∗ , we then transform the<br />

original search algorithm of a range tree by verifying geometry<br />

relationships of two rectangles at nodes. Similar, to<br />

distinguish the original one, we denote ours as 2D<strong>Range</strong>-<br />

Query ∗ . The search rule of 1D<strong>Range</strong>Query ∗ is the same<br />

as 2D<strong>Range</strong>Query ∗ , except when LeftRectangle(v) intersects<br />

HRQ, 1D<strong>Range</strong>Query ∗ reports all the nodes on the right<br />

subtree and continues to search on the left subtree. Clearly, we<br />

can easily extend 2D<strong>Range</strong>Query ∗ into its the D-dimensional<br />

version DD<strong>Range</strong>Query ∗ .<br />

4<br />

6,9<br />

Y-dimension<br />

(second level tree)<br />

5<br />

8<br />

2<br />

6,9<br />

3,8<br />

1,5<br />

4,2<br />

Two Rectangles<br />

HRL = ([1,6],[2,5])<br />

HRR = ([1,6],[5,9])<br />

(b) A first level tree and its second level tree. Each<br />

non-leaf node in x-dimension (or y-dimension) can be<br />

represented as two rectangles.<br />

Fig. 9. An example of a range tree.<br />

3<br />

10<br />

23<br />

19 30<br />

37<br />

49<br />

70 89 93<br />

3 10 19 23 30 37 59 62 70 80 93<br />

49<br />

59<br />

62<br />

Split Node<br />

(c) Find split node in a single dimension with<br />

query Q = [61,90].<br />

Algorithm 4: 2D<strong>Range</strong>Query ∗ (v,Q)<br />

Input: A 2D range tree Γ, and a query Q = [x1,x2]∧[y1,y2],<br />

which can be transformed as a rectangle<br />

HRQ = [Rx,Ry], where Rx = [x1,x2] and<br />

Ry = [y1,y2];<br />

Output: All points in Γ that lie in Q.<br />

1 vsplit ← FindSplitNode ∗ (Γ,Q);<br />

2 if vsplit is a leaf then<br />

3 Report it if the point X stored at vsplit lies in HRQ.<br />

4 else<br />

5 v ← LeftChild(vsplit);<br />

6 while v is not a leaf do<br />

7 if LeftRectangle(v) intersects HRQ then<br />

8 1D<strong>Range</strong>Query ∗ (Γassoc(RightChild(v)),Q);<br />

9 v ← LeftChild(v);<br />

10 else<br />

11 v ← RightChild(v);<br />

12 Report the leaf (where the path ends) if the point X stored<br />

at vsplit lies in HRQ;<br />

13 Similarly, follow the path from RightChild(vsplit) to some<br />

leaf, call 1D<strong>Range</strong>Query ∗ with the query Q on the<br />

associate structures of subtrees Γassoc(LeftChild(v)), which<br />

is on the left of the path, and report a point at the leaf<br />

(where the path ends) if it lies in HRQ.<br />

D. Hyper-Rectangle Containing Predicate Encryption<br />

As we argued at the beginning of this section, except hyperrectangle<br />

intersection predicate encryption, which is able to<br />

decide whether two rectangles intersect, we also need another<br />

predicate encryption, named hyper-rectangle containing predicate<br />

encryption, to decide whether a hyper-rectangle HR fully<br />

contains another hyper-rectangle HR ′ based on encrypted<br />

data.<br />

Once again, for the ease of description, we start to explain<br />

hyper-rectangle containing predicate encryption with two<br />

ranges in a single dimension. Given two ranges R and R ′ in<br />

a single dimension, we transform the first range R = [xl,xu]<br />

into a point in a two-dimension, and convert the second range<br />

R ′ = [x ′ l ,x′ u] into a rectangle in a two-dimension. Then,<br />

we can decide whether range R fully c<strong>over</strong>s range R ′ by<br />

verifying whether this two dimensional point is in the twodimensional<br />

rectangle. As shown in Fig. 10 and Lemma 5,<br />

the transformation of range R into a point is the same as<br />

hyper-rectangle intersection predication encryption and hyperrectangle<br />

contained predicate encryption, but the transforma-<br />

80<br />

89<br />

97


tion of range R ′ into the two-dimensional rectangle is neither<br />

the same as the way in hyper-rectangle intersection predicate<br />

encryption nor the approach in hyper-rectangle contained<br />

predicate encryption.<br />

x1,l<br />

x2,l<br />

x3,l<br />

x ′ l<br />

x1,u<br />

x ′ u<br />

x3,u<br />

x2,u<br />

1 T<br />

<strong>Range</strong>s fully c<strong>over</strong> R ′ = [x ′ l ,x′ u]<br />

<strong>Range</strong>s fully c<strong>over</strong> R ′ = [x ′ l ,x′ u]<br />

x4,l x4,u<br />

Fig. 10. Examples of whether one range fully c<strong>over</strong>s another one.<br />

Lemma 5: For two ranges R = [xl,xu] and R ′ = [x ′ l ,x′ u],<br />

we have a point X = (x1,x2) = (xl,xu) based on range R<br />

and a rectangle HR ′ = (R ′ 1,R ′ 2) based on range R ′ , where<br />

R ′ 1 = [1,x ′ l ] and R′ 2 = [x ′ u,T]. If X ∈ HR ′ , then R ⊇ R ′ ;<br />

otherwise, X /∈ HR ′ , then R R ′ .<br />

The correctness of this lemma can be explained as below<br />

R ⊇ R ′ ⇔<br />

xl ∈ [1,x ′ l ]<br />

xu ∈ [x ′ u,T] ⇔ X ∈ HR′ .<br />

By extending Lemma 5, we can decide whether a hyperrectangle<br />

fully c<strong>over</strong>s another hyper-rectangle in the multidimension.<br />

Lemma 6: For two hyper-rectangles HR = (R1,...,RD)<br />

and HR ′ = (R ′ 1,...,R ′ D ), where Ri = [xi,l,xi,u] and R ′ i =<br />

[x ′ i,l ,x′ i,u ], for i ∈ [D], we have a point X = (x1,...,x2D) =<br />

(x1,l,x1,u,...,xD,l,xD,u) and a hyper-rectangle HR ′′ =<br />

(R ′′<br />

1,...,R ′′<br />

2D ), where R′′ 2i−1 = [1,x′ i,l ] and R′′ 2i = [x′ i,u ,T],<br />

for i ∈ [D]. If X ∈ HR ′′ , then HR ⊇ HR ′ ; otherwise,<br />

X /∈ HR ′′ , then HR HR ′ .<br />

Then, by transforming two D-dimensional hyper-rectangles<br />

HR, HR ′ into a 2D-dimensional point and a 2D-dimensional<br />

hyper-rectangle respectively based on the above lemma, we<br />

can encrypt a message with this 2D-dimensional point (transformed<br />

from HR) using MRPE, and this message can<br />

only be decrypted with a decryption key generated by a<br />

2D-dimensional hyper-rectangle (transformed from HR ′ ), if<br />

HR ⊇ HR ′ . Clearly, this hyper-rectangle containing predicate<br />

encryption is selectively secure if MRPE is selectively<br />

secure.<br />

E. Construction based on <strong>Range</strong> Trees<br />

Using these two predicate encryptions together, we are able<br />

to build an MDRSE within polylogarithmic search time based<br />

on range trees <strong>over</strong> encrypted data. As we can see from the<br />

following box, during the encryption, each non-leaf node is encrypted<br />

with hyper-rectangle containing predicate encryption<br />

and hyper-rectangle intersection predicate encryption together;<br />

and each leaf node is still encrypted with the combination<br />

of symmetric key encryption, MRPE and its predicate-only<br />

version, which is the same as the case in kd-trees.<br />

During the search, our scheme exactly follows<br />

FindSplitNode ∗ and DD<strong>Range</strong>Query ∗ by computing hi,L,<br />

hi,R, gi,L, gi,R, and f respectively when it is necessary. For<br />

instance, when we need to decide whether LeftRectangle(v)<br />

fully c<strong>over</strong>s HRQ in FindSplitNode ∗ , we compute the<br />

corresponding hi,L of this node v, if hi,L = 1 (indicates<br />

LeftRectangle(v) fully c<strong>over</strong>s HRQ), then we continue to<br />

find split node in the left subtree. By doing these, we can<br />

exactly follow the search algorithm of a range tree when all<br />

the nodes are encrypted.<br />

F. Security and <strong>Privacy</strong> Analysis<br />

With the same approach in the previous section, we can<br />

prove that our scheme (range-tree) is also selectively secure<br />

and single dimensional confidentiality.<br />

Theorem 4: (Our Scheme (range trees)-selective security.)<br />

Our proposed scheme (range trees) is selectively secure<br />

against polynomial-time adversaries, as long as MRPE is<br />

selectively secure against polynomial-time adversaries.<br />

Theorem 5: (Single <strong>Dimensional</strong> Confidentiality). If our<br />

proposed scheme (range trees) is selectively secure, then it<br />

preserves single dimensional confidentiality.<br />

Although the search tokens are not able to be decomposed<br />

into single dimensions in our scheme (range tree), however, it<br />

is not able to achieve single dimensional query privacy. It is<br />

because the data structure of a range tree itself will inevitably<br />

reveal single dimensional query privacy. Now let us explain it<br />

with the following example.<br />

We first assume a data user obtains a search token TK TKQ<br />

based on a two-dimensional range query Q = [x1,x2] ∧<br />

[y1,y2]. Then this data user can submit his search tokenTK TK TKQ to<br />

the cloud, and the cloud server will search on the outsourced<br />

range tree Γ∗ with TK TKQ. If the cloud server colludes with<br />

this data user, stops the search algorithm right after finishing<br />

search on X-dimension, and directly returns the search results<br />

on X-dimension to the data user, then these results are exactly<br />

the same as the search results of a single dimensional range<br />

query Qx = [x1,x2], which means the data user can search on<br />

a range tree for an X-dimensional range query with a search<br />

token of a multi-dimensional query. To search independently<br />

on Y-dimension, the cloud server can directly jump to the<br />

y-dimension tree (second level tree) without searching on Xdimension.<br />

Then, by doing these, the search results after Ydimension<br />

ofQis exactly the same asQy = [y1,y2], no matter<br />

the nodes are encrypted or not.<br />

Well, the good news is to compromise the single dimensional<br />

query privacy, this data user needs to collaborate with<br />

the cloud server. In addition, this data user is not able to<br />

decrypt those results on single dimensions because single<br />

dimensional confidentiality is still protected. The reason of this<br />

privacy leak is caused by the fact that the search algorithm of<br />

a range tree is operated on each single dimension one after<br />

another. When performing search on x-dimension, the ranges<br />

of other dimensions will not affect the search decisions at<br />

nodes in x-dimension. Therefore, as long as two queries have<br />

the same range on x-dimension, the search results after xdimension<br />

(first level tree) is exactly the same, no matter


Setup(λ, {T1,...,TD}): Given security parameter λ and upper<br />

bound Ti in each dimension, build a lattice L∆ = [T1]×···×[TD]<br />

and a lattice L2∆ = [T1]×[T1]×···×[TD]×[TD]; compute<br />

(PK1,MK1) ←MRPE.Setup(λ,L∆),<br />

(PK2,MK2) ←MRPE O .Setup(λ,L∆),<br />

(PK3,MK3) ←MRPE O .Setup(λ,L2∆),<br />

(PK4,MK4) ←MRPE O .Setup(λ,L2∆);<br />

and output public key PK = {PK1,PK2,PK3,PK4} and master key<br />

MK = {MK1,MK2,MK3,MK4}.<br />

Build({M1,...,Mn}, {A1,...,An}): Given all the data records<br />

{M1,...,Mn} and their attributes {A1,...,An}, output range tree<br />

Γ.<br />

Encrypt(PK PK PK, {M1,...,Mn}, Γ): Given public key PK =<br />

{PK1,PK2,PK3,PK4}, all the data records {M1,...,Mn} and kdtree<br />

Γ, encrypt each node in range tree Γ.<br />

• If it is a non-leaf node, let HRi,L and HRi,R be the two<br />

hyper-rectangles represented on the i-level tree, where i ∈<br />

[1,D]; transform HRi,L = (Ri,1,..,Ri,D), where Ri,j =<br />

[xi,j,l,xi,j,u], into a point Xi,L = (xi,1,...,xi,2D) =<br />

(xi,1,l,xi,1,u,...,xi,D,l,xi,D,u), transform HRi,R into a<br />

point Xi,R in the same way; compute<br />

Ci,3,L ←MRPE O .Encrypt(PK3,Xi,L,Flag),<br />

Ci,3,R ←MRPE O .Encrypt(PK3,Xi,R,Flag),<br />

Ci,4,L ←MRPE O .Encrypt(PK4,Xi,L,Flag),<br />

Ci,4,R ←MRPE O .Encrypt(PK4,Xi,R,Flag),<br />

and output C = {C1,...,CD}, where Ci =<br />

{Ci,3,L,Ci,3,R,Ci,4,L,Ci,4,R} for the i-level tree of this<br />

non-leaf node.<br />

• If it is a leaf node, it is a data record Mi, i ∈ [1,n], and<br />

its attribute vector is Ai = (ai,1,...,ai,D); set a point X<br />

as X = (x1,...,xD) = (ai,1,...,ai,D), generate a random<br />

symmetric key Ki, then encrypt as<br />

Ci,0 ←Enc(Ki,Mi),<br />

Ci,1 ←MRPE.Encrypt(PK1,X,Ki,Flag),<br />

Ci,2 ←MRPE O .Encrypt(PK2,X,Flag),<br />

whether the nodes in the tree are encrypted or not.<br />

VI. PERFORMANCE<br />

In this section, we analyze the performance of our scheme,<br />

evaluate the performance in experiments based on a large-scale<br />

real-world dataset.<br />

A. Complexity Analysis<br />

In our scheme, we utilize a kd-tree to achieve sublinear<br />

search time <strong>over</strong> encrypted data. The search complexity of a<br />

kd-tree is O(n 1−1/D ), where n is the number of data records<br />

and D is the number of dimensions. We encrypt nodes in a<br />

kd-tree with two novel predicate encryptions based on MRPE<br />

[8], where the decryption cost at each node can be optimized<br />

to O(DlogT). Therefore, the total search time of our scheme<br />

is O(n 1−1/D · DlogT). According to the structure of a kdtree,<br />

the storage cost of a kd-tree for a number of n records<br />

is O(n). Considering the size of an encrypted value under<br />

MRPE is O(DlogT) [8], the total storage cost of our scheme<br />

is O(n·DlogT) as shown in Table I.<br />

Fig. 11. Details of our scheme based on range trees.<br />

and output Ci = {Ci,0,Ci,1,Ci,2}.<br />

Output outsourced range tree Γ ∗ after encrypting all the nodes.<br />

Extract(MK MK MK,Q): Given master key MK = {MK1,MK2,MK3,MK4}<br />

and a multi-dimensional range search query Q = [x1,l,x1,u] ∧<br />

··· ∧ [xD,l,xD,u], transform Q into a hyper-rectangle HR =<br />

(R1,...,RD), where Ri = [xi,l,xi,u] and 1 ≤ i ≤ D; generate a<br />

hyper-rectangle HR ′ = (R ′ 1,...,R ′ 2D), where R ′ 2i−1 = [1,xi,l]<br />

and R ′ 2i = [xi,u,Ti], for 1 ≤ i ≤ D, and another hyperrectangle<br />

HR ′′ = (R ′′<br />

1,...,R ′′<br />

2D), where R ′′<br />

2i−1 = [xi,l,xi,u] and<br />

R ′′<br />

2i = [xi,l,xi,u], for 1 ≤ i ≤ D, then compute<br />

SK1 ←MRPE.DeriveKey(MK1,HR),<br />

TK2 ←MRPE O .DeriveKey(MK2,HR),<br />

TK3 ←MRPE O .DeriveKey(MK3,HR ′ ),<br />

TK4 ←MRPE O .DeriveKey(MK4,HR ′′ );<br />

and return search token TK TKQ = {TK2,TK3,TK4} and decryption<br />

key SK SKQ = SK1.<br />

<strong>Search</strong>(TK TK TKQ, Γ ∗ ): Given search token TK TKQ = {TK2,TK3,TK4}<br />

and outsourced range tree Γ ∗ , search from the root node<br />

of Γ ∗ by exactly following FindSplitNode ∗ (Γ ∗ ,TK TK TKQ) and<br />

DD<strong>Range</strong>Query ∗ (Γ ∗ ,TK TK TKQ), and compute hi,L,hi,R,gi,L,<br />

gi,R,f when it is necessary, where<br />

hi,L ←MRPE O .Decrypt(PK3,TK3,Ci,3,L),<br />

hi,R ←MRPE O .Decrypt(PK3,TK3,Ci,3,R),<br />

gi,L ←MRPE O .Decrypt(PK4,TK4,Ci,4,L),<br />

gi,R ←MRPE O .Decrypt(PK4,TK4,Ci,3,R),<br />

f ←MRPE O .Decrypt(PK2,TK2,Ci,2);<br />

and report points that lies in Q.<br />

Decrypt(SK SK SKQ,Ci): Given decryption keySK SK SKQ and encrypted data<br />

record Ci = {Ci,0,Ci,1,Ci,2}, compute<br />

Ki ←MRPE.Decrypt(PK1,SKHR,Ci,1),<br />

Mi ← Dec(Ki,Ci,0) = Dec(Ki,Enc(Ki,Mi)),<br />

and output data record Mi, if attribute set Ai ∈ Q; and ⊥<br />

otherwise.<br />

B. Experimental Results<br />

Now we evaluate the performance of our proposed scheme,<br />

and compare it with the straightforward solution in experiments.<br />

All the experiments are tested under Ubuntu with<br />

Intel Core i5 3.30 GHz Processor and 4 GB Memory. We<br />

leverage Pairing-Based Library (PBC) 1 to simulate the cost of<br />

cryptographic operations. We use a large-scale real world data<br />

set with millions of records (U.S. census 1990 2 ) to evaluate the<br />

search performance of our scheme and the straightforward solution<br />

[8]. In the experiments, we test the search performance<br />

<strong>over</strong> 200 times with the input of random multi-dimensional<br />

range search queries.<br />

As we can see from Fig. 12(a) and 12(b), the average<br />

search time in the straightforward solution increases linearly<br />

with the number of records when the number of dimensions<br />

is fixed. While the average search time of our scheme also<br />

increases with the number of records, however, it increases<br />

1 crypto.stanford.edu/pbc/<br />

2 http://archive.ics.uci.edu/ml/datasets.html


Ave. <strong>Search</strong> Time (s)<br />

9000<br />

8000<br />

7000<br />

6000<br />

5000<br />

4000<br />

3000<br />

2000<br />

1000<br />

Our Scheme<br />

Straightforward<br />

0<br />

10 20 30 40 50<br />

n: the number of records (thousand)<br />

(a) Average search time (second)<br />

when D = 4.<br />

Ave. <strong>Search</strong> Time (s)<br />

3000<br />

2500<br />

2000<br />

1500<br />

1000<br />

500<br />

0<br />

2 3 4 5<br />

D: the number of dimensions<br />

Ave. <strong>Search</strong> Time (s)<br />

1600<br />

1400<br />

1200<br />

1000<br />

800<br />

600<br />

400<br />

200<br />

0<br />

Our Scheme<br />

Straightforward<br />

10 20 30 40 50<br />

n: the number of records (thousand)<br />

(b) Average search time (second)<br />

when D = 2.<br />

Fig. 12. Impact of n on average search time.<br />

20<br />

Our Scheme<br />

Straightforward<br />

15<br />

Our Scheme<br />

Straightforward<br />

(a) Average search time (second)<br />

when n = 10,000.<br />

Ave. <strong>Search</strong> Time (ks)<br />

10<br />

5<br />

0<br />

2 3 4 5<br />

D: the number of dimensions<br />

(b) Average search time (kilosecond)<br />

when n = 100,000.<br />

Fig. 13. Impact of D on average search time.<br />

much slower than the straightforward solution. Therefore, our<br />

scheme can finish search much faster than the straightforward<br />

solution under the same parameters. For instance, when D = 4<br />

and n = 50,000, the straightforward solution requires 7,377<br />

seconds in average to finish each multi-dimensional range<br />

search query; with our solution, the same search query can<br />

be done within 1,537 seconds in average. We can also find in<br />

Fig. 13(a) and 13(b) that our scheme has a better performance<br />

than the straightforward solution.<br />

In Table II, we compare the performance of our scheme<br />

and the straightforward solution under different scales of data<br />

records. Compared to the straightforward solution, our scheme<br />

is able to save a huge of time on multi-dimensional range<br />

search queries. Especially, when the total of the data records<br />

is one million and the number of dimensions is D = 4, our<br />

scheme is able to finish search within 5.33×10 3 seconds in<br />

average while the average search time of the straightforward<br />

solution is 152.03×10 3 seconds.<br />

TABLE II<br />

AVERAGE SEARCH TIME (KILOSECOND) WHEN D = 4.<br />

n Our Scheme Straightforward<br />

10,000 0.47 1.59<br />

100,000 3.25 15.44<br />

1,000,000 5.33 152.03<br />

As we mentioned in the previous section, we can improve<br />

search efficiency by leveraging range trees. For instance, with<br />

a number of 4,096 records, the search time <strong>over</strong> encrypted<br />

data with range trees is only around 10.96 seconds while the<br />

one with kd-trees is about 161.03 seconds.<br />

TABLE III<br />

AVERAGE SEARCH TIME (SECOND) WHEN D = 4.<br />

n With kd-trees With range-trees<br />

1,024 82.10 8.89<br />

2,048 142.33 9.46<br />

4,096 161.03 10.96<br />

Note that since the main cryptographic operation during<br />

the decryption of MRPE is the pairing operation, we can use<br />

some optimized method [20] on the computation of pairing<br />

to improve the current search performance. More specifically,<br />

by using an efficient method [20] for computing pairing<br />

operations, we can reduce the time of computing pairing from<br />

6.33 ms to 1.69 ms, which can significantly improve the<br />

current search performance of our scheme. For instance, with<br />

the optimization on the computation of pairing operations,<br />

when the number of data records is one million and the<br />

number of dimension of search queries is D = 4, then the<br />

average search time <strong>over</strong> encrypted data can be reduced to<br />

1,423 seconds. In addition, we can also leverage some special<br />

hardware, such as the Elliptic Semiconductor CLP-17 [21],<br />

to further reduce the search time on multi-dimensional range<br />

search <strong>over</strong> encrypted cloud data.<br />

VII. RELATED WORK<br />

<strong>Search</strong>able Encryption. <strong>Search</strong>able Encryption schemes can<br />

be divided into two categories, symmetric key based and<br />

public key based. The first practical symmetric key based<br />

keyword search solution for encrypted text documents was<br />

proposed by Song et al. [2], which has linear complexity with<br />

regard to the document length. Since then, the quest for higher<br />

efficiency and better functionality has never stopped. Various<br />

schemes has improved the efficiency of keyword search by<br />

creating an encrypted searchable index such like an inverted<br />

index [4], [5], [7], [22], [23]. However, the above works can<br />

only support single keyword search functionalities. Recently<br />

Cao et al. [24] proposed a multi-keyword ranked search<br />

scheme <strong>over</strong> encrypted cloud data. Kamara et al. focused on<br />

enabling privacy-preserving dynamic updates for SSE [13].<br />

However, the search complexity is still linear with the number<br />

of documents. More<strong>over</strong>, the above works mostly focus on<br />

document search and does not apply to searches on databases<br />

(may contain numerical values) considered in this paper.<br />

On the other hand, the first public key based keyword<br />

searchable encryption scheme was proposed by Boneh et al.<br />

[3]. To support more complex queries, conjunctive keyword<br />

search solutions <strong>over</strong> encrypted data have been proposed [25],<br />

[26]. They are based on bilinear pairings, while the search<br />

needs to touch every data record. Bellare et al. [27] proposed<br />

an efficient deterministic searchable public key encryption<br />

scheme. Logarithmic search complexity can be achieved;<br />

however, it only allows equality test. In addition, due to its<br />

deterministic property, plaintext frequency will be exposed due<br />

to duplicate attributes.<br />

Predicate Encryption. Predicate Encryption is a promising<br />

approach to achieve more flexible search functionalities [28],<br />

[29]. In general, given a ciphertext encrypting under an<br />

attribute vector A, a predicate function f(), the encryption<br />

enables a user who holds the corresponding token to f() to<br />

test whether f(A) = 1 without decrypting the data.<br />

Boneh et al. proposed a public key based hidden vector<br />

encryption scheme [6], which supports conjunctive subset,<br />

equality and range queries. However, the complexity of range<br />

search per data record is O(TD) (linear with the range size<br />

T ), which is far from efficient when the range is large. Later<br />

Katz et al. [30] proposed another public key based predicate<br />

encryption scheme that supports inner products. Theoretically,


equality, subset, conjunctive, disjunctive predicates can all be<br />

expressed in the form of inner-products. However, it incurs an<br />

exponential blowup for handling arbitrary boolean expressions<br />

(containing both CNF/DNF [29]).<br />

Shi et al. [8] proposed an MRPE by exploiting an efficient<br />

tree-based range representation, the per data record search<br />

complexity is reduced to O(DlogT). Later, Shen et al. [31]<br />

found the predicate encryption in [8] may inherently reveal<br />

the query predicate within a token, and proposed a symmetrickey<br />

inner product encryption scheme to solve it. However, it<br />

does not support efficient multi-dimensional range search and<br />

the symmetric-key based approach is not suitable for multiowner<br />

settings. To enable multi-dimensional range query with<br />

predicate privacy, Li et al. [32] proposed a public key based<br />

encrypted search scheme, by introducing an additional trusted<br />

proxy that keeps a secret proxy key. However, it lacks support<br />

for efficient arbitrary range queries. Finally, none of the above<br />

schemes achieve sublinear time search complexity with regard<br />

to the dataset size.<br />

VIII. CONCLUSION<br />

In this paper, we address the problem of efficient and<br />

privacy-preserving search <strong>over</strong> encrypted databases outsourced<br />

to the cloud. We propose a suite of MDRSE for handling multidimensional<br />

range queries <strong>over</strong> encrypted data within sublinear<br />

search time. Through a novel combination of predicate encryption<br />

primitives with efficient multi-dimensional data structures<br />

(such as kd-trees and range trees), efficient MDRSE can be<br />

realized. Our approach is general, in that it can be easily<br />

integrated with different multi-dimensional data structures to<br />

achieve different tradeoffs. We carry out extensive experiments<br />

on a large-scale real-world dataset and show the scalability and<br />

efficiency of our schemes.<br />

REFERENCES<br />

[1] M. Armbrust, A. Fox, R. Griffith, A. D.Joseph, R. H.Katz, A. Konwinski,<br />

G. Lee, D. A. Patterson, A. Rabkin, I. Stoica, and M. Zaharia, “A View<br />

of Cloud Computing,” Communications of the ACM, vol. 53, no. 4, pp.<br />

50–58, Apirl 2010.<br />

[2] D. Song, D. Wagner, and A. Perrig, “Practical Techniques for <strong>Search</strong>es<br />

on Encrypted Data,” in Proc. of IEEE S&P’00, 2000.<br />

[3] D. Boneh, G. D. Crescenzo, R. Ostrovsky, and G. Persiano, “Public Key<br />

Encryption with Keyword <strong>Search</strong>,” in Proc. of EUROCRYP’04, 2004,<br />

pp. 506–522.<br />

[4] E.-J. Goh, “Secure Indexes,” Cryptology ePrint Archive, Report<br />

2003/216, 2003, http://eprint.iacr.org/.<br />

[5] R. Curtmola, J. A. Garay, S. Kamara, and R. Ostrovsky, “<strong>Search</strong>able<br />

Symmetric Encryption: Improved Definitions and Efficient Constructions,”<br />

in Proc. of ACM CCS’06, 2006.<br />

[6] D. Boneh and B. Waters, “Conjunctive, Subset, and <strong>Range</strong> Queries on<br />

Encrypted Data,” in Proc. of TCC’07, 2007, pp. 535–554.<br />

[7] C. Wang, N. Cao, J. Li, K. Ren, and W. Lou, “Secure Ranked Keyword<br />

<strong>Search</strong> <strong>over</strong> Encrypted Cloud Data,” in Proc. of ICDCS’10, 2010.<br />

[8] E. Shi, J. Bethencourt, T.-H. H. Chan, D. Song, and A. Perrig, “<strong>Multi</strong>-<br />

<strong>Dimensional</strong> <strong>Range</strong> Query <strong>over</strong> Encrypted Data,” in Proc. of IEEE<br />

S&P’07, 2007, pp. 350–364.<br />

[9] Y. Lu, “<strong>Privacy</strong>-<strong>Preserving</strong> Logarithmic-time <strong>Search</strong> on Encrypted Data<br />

in Cloud,” in Proc. of NDSS’12, Feb. 2012.<br />

[10] J. L. Bentley, “<strong>Multi</strong>dimensional Binary <strong>Search</strong> Trees Used for Associative<br />

<strong>Search</strong>ing,” Communications of the ACM, vol. 18, no. 9, pp.<br />

509–517, 1975.<br />

[11] Mark de Berg and Otfried Cheong and Marc van Kreveld and Mark<br />

Overmars, Computational Geometry: Algorithms and Applications.<br />

Springer-Verlag, 2008.<br />

[12] J. L. Bentley, “Decomposable <strong>Search</strong>ing Problems,” Information Processing<br />

Letters, vol. 8, no. 5, pp. 201–244, 1979.<br />

[13] S. Kamara, C. Papamanthou, and T. Roeder, “Dynamic <strong>Search</strong>able<br />

Symmetric Encryption,” in Proc. of ACM CCS’12, 2012, pp. 965–976.<br />

[14] D. Boneh, G. D. Crescenzo, R. Ostrovsky, and G. Persiano, “Public<br />

Key Encryption with Keyword <strong>Search</strong>,” in Proc. of EUROCRYPT’04.<br />

Springer-Verlag, 2004, pp. 506–522.<br />

[15] M. Islam, M. Kuzu, and M. Kantarcioglu, “Access Pattern Disclosure on<br />

<strong>Search</strong>able Encryption: Ramification, Attack and Mitigation,” in Proc.<br />

of NDSS’12, 2012.<br />

[16] X. Boyen and B. Waters, “Anonymous Hierarchical Identity-Based<br />

Encryption (Without Random Oracles),” in Proc. of CRYPTO’06.<br />

Springer-Verlag, 2006, pp. 290–307.<br />

[17] Dan Boneh and Amit Sahai and Brent Waters, “Functional Encryption:<br />

A New Vision for Public Key Cryptography,” Communications of the<br />

ACM, vol. 55, no. 11, pp. 56–64, 2012.<br />

[18] A. Guttman, “R-Trees A Dynamic Index Structure for Spatial <strong>Search</strong>ing,”<br />

in Proc. of ACM SIGMOD’84, 1984.<br />

[19] R. Finkel and J. L. Bentley, “Quad Trees: A Data Structure for Retrieval<br />

on Composite Keys,” ACTA Informatica, vol. 4, no. 1, pp. 1–9, 1974.<br />

[20] D. F. Aranha, K. Karabina, P. Longa, C. H. Gebotys, and J. Lopez,<br />

“Faster Explicit Formulas for Computing Pairings <strong>over</strong> Ordinary<br />

Curves,” in Proc. of EUROCRYPT’11, 2011, pp. 48–68.<br />

[21] “The Elliptic Semiconductor (CLP-17): high performance<br />

elliptic curve cryptography point multiplier core,”<br />

http://www.ellipticsemi.com/pdf/CLP-17 90401.pdf.<br />

[22] Y.-C. Chang and M. Mitzenmacher, “<strong>Privacy</strong> <strong>Preserving</strong> Keyword<br />

<strong>Search</strong>es on Remote Encrypted Data,” in Proc. of ACNS’05, 2005, pp.<br />

442–455.<br />

[23] Z. Yang, S. Zhong, and R. N. Wright, “<strong>Privacy</strong>-<strong>Preserving</strong> Queries on<br />

Encrypted Data,” in Proc. of ESORICS’06, 2006, pp. 479–495.<br />

[24] N. Cao, C. Wang, M. Li, K. Ren, and W. Lou, “<strong>Privacy</strong>-<strong>Preserving</strong><br />

<strong>Multi</strong>-keyword Ranked <strong>Search</strong> <strong>over</strong> Encrypted Cloud Data,” in Proc. of<br />

IEEE INFOCOM’11, 2011.<br />

[25] P. Golle, J. Staddon, and B. Waters, “Secure Conjunctive Keyword<br />

<strong>Search</strong> <strong>over</strong> Encrypted Data,” in Proc. of ACNS’04. Springer-Verlag,<br />

2004, pp. 31–45.<br />

[26] L. Ballard, S. Kamara, and F. Monrose, “Achieving Efficient Conjunctive<br />

Keyword <strong>Search</strong>es <strong>over</strong> Encrypted Data,” Information and Communications<br />

Security, pp. 414–426, 2005.<br />

[27] M. Bellare, A. Boldyreva, and A. O’Neill, “Deterministic and Efficiently<br />

<strong>Search</strong>able Encryption,” in Proc. of CRYPTO’07, 2007, pp. 535–552.<br />

[28] T. Okamoto and K. Takashima, “Hierarchical Predicate Encryption for<br />

Inner-Products,” in Proc. of ASIACRYPT’09, 2009, pp. 214–231.<br />

[29] A. Lewko, T. Okamoto, A. Sahai, K. Takashima, and B. Waters,<br />

“Fully Secure Functional Encryption: Attribute-Based Encryption and<br />

(Hierarchical) Inner Product Encryption,” in Proc. of EUROCRYPT’10,<br />

2010, pp. 62–91.<br />

[30] J. Katz, A. Sahai, and B. Waters, “Predicate Encryption Supporting<br />

Disjunctions, Polynomial Equations, and Inner Products,” in Proc. of<br />

EUROCRYPT’08, 2008, pp. 146–162.<br />

[31] E. Shen, E. Shi, and B. Waters, “Predicate <strong>Privacy</strong> in Encryption<br />

Systems,” in Proc. of TCC’09, 2009, pp. 457–473.<br />

[32] M. Li, S. Yu, N. Cao, and W. Lou, “Authorized Private Keyword <strong>Search</strong><br />

<strong>over</strong> Encrypted Data in Cloud Computing,” in Proc. of IEEE ICDCS’11,<br />

2011, pp. 383–392.

Hooray! Your file is uploaded and ready to be published.

Saved successfully!

Ooh no, something went wrong!