14.01.2014 Views

1 Quorum System


Com S 611 Spring Semester 2013<br />

Advanced topics on Networks and Distributed Algorithms<br />

Lecture 13: Monday, March 4, 2013<br />

Instructor: Soma Chaudhuri<br />

Scribe: Debasis Mandal<br />

We are going to start a new topic today: <strong>Quorum</strong> system. The lexical meaning of the word<br />

<strong>Quorum</strong> is “the minimum number of members of a group that needs to be present at any<br />

of its meetings to make any decision for the group”. For example, a majority is a quorum<br />

for voting in a group. A majority is required in a voting system to avoid partitioning<br />

and inconsistency in the decision process. In practice, we may even require a<br />

2/3 majority for a decision to be accepted by the group.<br />

1 <strong>Quorum</strong> <strong>System</strong><br />

In distributed computing, a quorum system usually means a collection of pairwise intersecting subsets<br />

of nodes in the network, each of which is large enough to make a decision. Intuitively, a majority is<br />

sufficient to prevent inconsistencies: given a set S and A, B ⊆ S, if |A|, |B| > |S|/2, it<br />

always holds that A ∩ B ≠ ∅. More formally,<br />

Definition 1 A quorum system is a collection of subsets of nodes, called quorums, such<br />

that each pair of quorums has a non-empty intersection.<br />
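
Definition 1 and the majority argument above can be checked mechanically on a small example. The following is a minimal sketch, not part of the lecture: the helper names `is_quorum_system` and `majority_quorums` and the five node labels are assumptions chosen for illustration.

```python
from itertools import combinations


def is_quorum_system(quorums):
    """Return True if every pair of quorums shares at least one node."""
    sets = [frozenset(q) for q in quorums]
    return all(a & b for a, b in combinations(sets, 2))


def majority_quorums(nodes):
    """All subsets of `nodes` containing strictly more than half of them."""
    nodes = list(nodes)
    smallest = len(nodes) // 2 + 1
    return [frozenset(c)
            for size in range(smallest, len(nodes) + 1)
            for c in combinations(nodes, size)]


# Hypothetical five-node system.
S = {"s1", "s2", "s3", "s4", "s5"}
majorities = majority_quorums(S)
```

Here `is_quorum_system(majorities)` holds, whereas two disjoint sets such as `{"s1", "s2"}` and `{"s3", "s4"}` do not form a quorum system.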

1.1 Properties<br />

• A <strong>quorum</strong> system is a mathematical abstraction for guaranteeing consistency in fault-tolerant<br />

systems. Its goal is to maintain consistency while involving a minimum number of nodes.<br />

For example, in a read/write distributed storage system of 3 processes, writing a value<br />

to any two processes (a majority) guarantees that any later read from any two processes<br />

will return a consistent value, since any two majorities share at least one process.<br />

• <strong>Quorum</strong>s are critical for many applications in distributed computing, mainly where we<br />

want to avoid network partitioning.<br />

1.2 Applications<br />

<strong>Quorum</strong> systems have been used to implement a wide variety of distributed objects and<br />

services, for example,<br />

• replicated databases,<br />

• mutual exclusion,<br />

• read/write storage, and<br />

• group communication.<br />

We’ll cover classical quorum systems in the class and evaluate the quorum systems using<br />

various measures. Specifically, we’ll look into the following two applications of quorum systems:<br />

1. distributed read/write storage (Lamport’s register) and<br />

2. consensus.<br />

1.2.1 Replicated Databases<br />

A major goal of replicated databases is to ensure consistency in the presence of failures,<br />

and <strong>quorum</strong> systems were first used to implement them by [Thomas, 1979]. He proposed a<br />

majority approach for reaching consensus in order to maintain concurrency control<br />

over multiple copies of a replicated database. The majority approach works as follows. To<br />

write data into the database, the writer would timestamp the data (Lamport timestamp) and<br />

write it to a majority of servers. Then to read data from the database, the reader would<br />

contact a majority (possibly different) and return the data with the highest timestamp.<br />

Later, arbitrary quorum sizes (not just majorities) were also allowed, but all of them required<br />

non-empty pairwise intersections between quorums (the main reason behind the consistency<br />

of a <strong>quorum</strong> system). In fact, separate read and write quorums have also been studied, where only<br />

quorums of different classes need to intersect (for example, a read quorum and a write quorum must intersect,<br />

but two read quorums need not).<br />
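
The majority approach described above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the protocol of [Thomas, 1979]: there are no failures or concurrency, the timestamps are plain integers supplied by the caller rather than real Lamport timestamps, and the names `Replica`, `majority`, `write`, and `read` are hypothetical.

```python
import random


class Replica:
    """One server holding a (timestamp, value) pair."""

    def __init__(self):
        self.timestamp = 0
        self.value = None

    def store(self, timestamp, value):
        # Keep only the newest write.
        if timestamp > self.timestamp:
            self.timestamp, self.value = timestamp, value


def majority(replicas, rng):
    """Pick an arbitrary majority of the replicas."""
    return rng.sample(replicas, len(replicas) // 2 + 1)


def write(replicas, timestamp, value, rng):
    """Write the timestamped value to some majority of the servers."""
    for replica in majority(replicas, rng):
        replica.store(timestamp, value)


def read(replicas, rng):
    """Contact some majority and return the value with the highest timestamp."""
    newest = max(majority(replicas, rng), key=lambda r: r.timestamp)
    return newest.value


rng = random.Random(0)
replicas = [Replica() for _ in range(3)]
write(replicas, 1, "a", rng)
write(replicas, 2, "b", rng)
```

Whichever majority a later `read` happens to contact, it overlaps the majority that stored timestamp 2, so the read returns `"b"`.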

1.3 <strong>System</strong> Model<br />

Let S = {s_1, s_2, ..., s_n} be the set of n processes that constitute the distributed system.<br />

Each process runs its own protocol. For now, we assume a fixed set of processes, but<br />

this model can be extended to deal with systems with dynamic process membership, where<br />

processes join or leave the system (and hence S is not fixed).<br />

Definition 2 A process is a state machine (or I/O automaton), which moves between<br />

states by performing process actions (called transitions), as defined below. Every<br />

process has input, internal, and output actions, denoted by in(s), int(s), and out(s) respectively,<br />

for a given process s. The external actions of a process are either inputs to it (recv) or outputs<br />

to other processes (send). More formally,<br />

1. ext(s) = in(s) ∪ out(s)<br />

2. local(s) = int(s) ∪ out(s).<br />
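
As a small illustration of these two identities, the action sets of a process can be represented directly as sets. This is only a sketch: the class name `Signature` and the example action names `recv`, `compute`, and `send` are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Signature:
    """Action sets in(s), int(s), out(s) of a single process s."""
    inputs: frozenset
    internal: frozenset
    outputs: frozenset

    def ext(self):
        # ext(s) = in(s) ∪ out(s)
        return self.inputs | self.outputs

    def local(self):
        # local(s) = int(s) ∪ out(s)
        return self.internal | self.outputs


# Hypothetical process with one action of each kind.
sig = Signature(
    inputs=frozenset({"recv"}),
    internal=frozenset({"compute"}),
    outputs=frozenset({"send"}),
)
```

For this process, `sig.ext()` contains `recv` and `send`, while `sig.local()` contains `compute` and `send`.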

Definition 3 An execution is a (possibly infinite) sequence of alternating global states and<br />

process actions. More formally, e = st_0, π_1, st_1, π_2, ..., where e is the execution, π_i is a<br />

process action, and st_i is a global state, such that (st_i, π_{i+1}, st_{i+1}) represents one step of the<br />

system. Furthermore, action π_{i+1} is enabled in global state st_i.<br />

Definition 4 A partial execution is a finite prefix of some execution. A (partial) execution<br />

e extends a partial execution e′ if e′ is a prefix of e.<br />
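
Definitions 3 and 4 can be made concrete for finite sequences. The sketch below assumes that the allowed steps of the system are given as a set of (state, action, state) triples and that a partial execution ends in a state; the function names and the toy counter system are hypothetical.

```python
def is_execution(seq, steps):
    """Check that seq = [st_0, a_1, st_1, a_2, ...] alternates states and
    actions and that every (st_i, a_{i+1}, st_{i+1}) is an allowed step.

    Assumes a finite sequence that starts and ends with a state.
    """
    if len(seq) % 2 == 0:
        return False
    return all((seq[i], seq[i + 1], seq[i + 2]) in steps
               for i in range(0, len(seq) - 2, 2))


def extends(e, e_prime):
    """True if e_prime is a prefix of e."""
    return e[:len(e_prime)] == e_prime


# Toy system: a counter that can be incremented twice.
steps = {(0, "inc", 1), (1, "inc", 2)}
e = [0, "inc", 1, "inc", 2]
```

In this toy system `e` is a valid execution and it extends the partial execution `[0, "inc", 1]`, whereas `[0, "inc", 2]` is not an execution because the step from 0 to 2 is not allowed.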

1.4 Communication models<br />

As is typical in distributed computing, we’ll consider two types of communication models.<br />

1.4.1 Message passing model<br />

Each message is delivered point-to-point in this model and each send and recv action is<br />

atomic, but a broadcast is implemented by a sequence of send actions (and hence is not atomic).<br />

A complete bi-directional network with links between every pair of processes is also assumed.<br />

If {s_1, s_2, ..., s_n} is the set of automata and {l_ij : s_i, s_j are processes} is the set of channels<br />

between processes, then the set {s_i : 1 ≤ i ≤ n} ∪ {l_ij : s_i, s_j are processes} constitutes<br />

the global automaton of the system, and the states of all of its elements at some time together form the<br />

global state of the automaton at that time.<br />
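
To see why a broadcast built from individual sends is not atomic, consider the following sketch. It makes assumptions that go beyond the text: each bi-directional link is modelled as two directed FIFO queues, processes are numbered from 0, and the `crash_after` parameter is a hypothetical device for stopping the sender partway through.

```python
from collections import deque


class Network:
    """Complete network with one channel l_ij per ordered pair (i, j)."""

    def __init__(self, n):
        self.n = n
        self.channels = {(i, j): deque()
                         for i in range(n) for j in range(n) if i != j}

    def send(self, i, j, msg):
        # One atomic send action: append msg to channel l_ij.
        self.channels[(i, j)].append(msg)

    def recv(self, i, j):
        # One atomic recv action at j: take the oldest message from l_ij.
        return self.channels[(i, j)].popleft()

    def broadcast(self, i, msg, crash_after=None):
        """Broadcast as a sequence of sends. If crash_after is given, the
        sender stops after that many sends. Returns the number of sends."""
        sent = 0
        for j in range(self.n):
            if j == i:
                continue
            if crash_after is not None and sent == crash_after:
                return sent
            self.send(i, j, msg)
            sent += 1
        return sent


net = Network(4)
delivered = net.broadcast(0, "m", crash_after=2)
```

After this run, the channels to processes 1 and 2 hold the message while the channel to process 3 is empty, a state that an atomic broadcast could never produce.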

1.4.2 Shared memory model<br />

In this model, processes communicate through operations on shared objects. Suppose {s_1, s_2, ..., s_n}<br />

are the processes, and {O_1, O_2, ..., O_m} are the shared objects. Then the following<br />

are the atomic actions in this model:<br />

• inv_s(O, op, v): process s invokes op on O (output action of s, input action of O), and<br />

• resp_s(O, op, v): process s receives a response (input action of s, output action of O),<br />

where v is the value returned by the operation op.<br />

Definition 5 An operation is complete in an execution if its invocation has a<br />

matching response, and pending if its invocation has no response.<br />
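
Definition 5 can be illustrated by scanning a recorded sequence of inv and resp actions. The sketch assumes that each process has at most one outstanding operation at a time, so that a response matches that process's open invocation on the same object and operation; the tuple layout and the name `classify` are hypothetical.

```python
def classify(history):
    """Split invocations into complete and pending ones.

    history: list of (kind, process, obj, op, value) tuples with kind in
    {"inv", "resp"}. Returns two lists of indices into history that point
    at the invocations of complete and of pending operations.
    """
    open_inv = {}   # process -> index of its unanswered invocation
    complete = []
    for idx, (kind, proc, obj, op, _value) in enumerate(history):
        if kind == "inv":
            open_inv[proc] = idx
        elif kind == "resp" and proc in open_inv:
            inv = history[open_inv[proc]]
            if inv[2] == obj and inv[3] == op:
                complete.append(open_inv.pop(proc))
    pending = sorted(open_inv.values())
    return complete, pending


history = [
    ("inv", "s1", "O1", "write", 5),
    ("inv", "s2", "O1", "read", None),
    ("resp", "s1", "O1", "write", "ack"),
]
complete, pending = classify(history)
```

In this history the write by s1 is complete, and the read by s2 is pending because it has no response.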

1.5 Process failures<br />

The following are the usual kinds of failures that processes and shared objects exhibit in both shared<br />

memory and message-passing systems.<br />

Definition 6 If a process behaves correctly according to its protocol, then it’s correct. A<br />

crash failure is one where the process (or shared object) stops executing its protocol permanently.<br />

A process is benign if it’s correct or it has a crash failure. A process that is not benign is<br />

called Byzantine or malicious.<br />

Byzantine failures are the worst kind of failures. They are in turn of two types:<br />

1. Unauthenticated: Here, a process can pretend to be some other process and can possibly<br />

forge the signatures of others. Processes can send arbitrary messages in the message passing<br />

model or invoke arbitrary operations in the shared memory model.<br />

2. Authenticated: Here, we assume that each process has a digital signature,<br />

and thus no process can forge another process’s signature.<br />

Definition 7 A fault configuration is a vector C ∈ {0, 1}^n such that C_i = 1 if and only if<br />

the process s_i has failed.<br />

Definition 8 Given a set of processes S and an execution e, we define alive(e, S) as the set<br />

of correct processes in S, and faulty(e, S) as the set of faulty processes in S.<br />

We’ll write them as alive(S) and faulty(S), when e is clear from the context.<br />

Definition 9 A set of processes Q ⊆ S is available if Q ⊆ alive(S).<br />
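
Definitions 7 to 9 fit together as in the sketch below. It makes a simplifying assumption that the lecture does not state: the alive and faulty sets are read directly off a fault configuration C instead of an execution e, with processes identified by their 0-based index. The function names are hypothetical.

```python
def alive(config):
    """Indices i with C_i = 0, i.e. processes that have not failed."""
    return {i for i, failed in enumerate(config) if failed == 0}


def faulty(config):
    """Indices i with C_i = 1, i.e. processes that have failed."""
    return {i for i, failed in enumerate(config) if failed == 1}


def is_available(quorum, config):
    """Definition 9: Q is available if Q ⊆ alive(S)."""
    return set(quorum) <= alive(config)


# Five processes; those with index 1 and 4 have failed.
C = (0, 1, 0, 0, 1)
```

With this configuration the set `{0, 2}` is available, while `{0, 1}` is not because process 1 has failed.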

We also consider a probabilistic fault-tolerant model, where each process s_i in the set S fails<br />

independently with probability p_i.<br />

Definition 10 If p_i = p for all i, then it’s called a uniform probabilistic fault-tolerant model.<br />
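
Because failures are independent in this model, the probability of a particular fault configuration is a product over the processes, and so is the probability that a given set of processes is available. The sketch below works this out; the function names and the value p = 0.1 are assumptions chosen for illustration.

```python
from math import prod


def config_probability(config, p):
    """Probability of fault configuration C when process i fails
    independently with probability p[i]."""
    return prod(p_i if failed else 1 - p_i
                for failed, p_i in zip(config, p))


def availability(quorum, p):
    """Probability that every process in `quorum` is alive."""
    return prod(1 - p[i] for i in quorum)


# Uniform model with three processes and p = 0.1.
p = [0.1, 0.1, 0.1]
prob_one_failure = config_probability((0, 1, 0), p)   # 0.9 * 0.1 * 0.9
prob_quorum_up = availability({0, 1}, p)              # 0.9 * 0.9
```

Summing `config_probability` over all 2^n fault configurations gives 1, which is a quick sanity check on the model.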

References<br />

[Thomas, 1979] Thomas, R. H. (1979). A majority consensus approach to concurrency control<br />

for multiple copy databases. ACM Trans. Database Syst., 4(2):180–209.<br />
