1 Quorum System
Com S 611 Spring Semester 2013<br />
Advanced topics on Networks and Distributed Algorithms<br />
Lecture 13: Monday, March 4, 2013<br />
Instructor: Soma Chaudhuri<br />
Scribe: Debasis Mandal<br />
We are going to start a new topic today: <strong>Quorum</strong> system. The lexical meaning of the word<br />
<strong>Quorum</strong> is “the minimum number of members of a group that needs to be present at any<br />
of its meetings to make any decision for the group”. For example, a majority is a quorum<br />
for voting in a group. A majority is required in a voting system to avoid partitioning<br />
and inconsistency in the decision process. In practice, a decision may even require a<br />
two-thirds majority to be accepted by the group.<br />
1 <strong>Quorum</strong> <strong>System</strong><br />
In distributed computing, a quorum system usually means a collection of pairwise intersecting<br />
subsets of nodes in the network, each of which is large enough to make a decision. Intuitively, a majority is<br />
sufficient to prevent inconsistencies: given a set S and A, B ⊆ S, if |A|, |B| > |S|/2, then<br />
A ∩ B ≠ ∅ always holds, since otherwise |A ∪ B| = |A| + |B| > |S|. More formally,<br />
Definition 1 A quorum system is a collection of subsets of nodes, called quorums, such<br />
that every pair of quorums has a non-empty intersection.<br />
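Definition 1 can be checked mechanically on small examples. The following is a minimal sketch, not part of the lecture: the function names `is_quorum_system` and `majority_quorums` are illustrative assumptions, and nodes are represented as plain strings.

```python
from itertools import combinations

def is_quorum_system(quorums):
    """Return True if every pair of quorums shares at least one node."""
    return all(set(a) & set(b) for a, b in combinations(quorums, 2))

def majority_quorums(nodes):
    """All subsets of `nodes` that contain more than half of the nodes."""
    nodes = list(nodes)
    smallest = len(nodes) // 2 + 1
    return [frozenset(c)
            for k in range(smallest, len(nodes) + 1)
            for c in combinations(nodes, k)]

majorities = majority_quorums(["s1", "s2", "s3"])
print(is_quorum_system(majorities))               # True
print(is_quorum_system([{"s1"}, {"s2", "s3"}]))   # False: the two sets are disjoint
```

For three nodes the majorities are the three pairs plus the full set, and any two of them overlap, which is the counting argument above in executable form.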
1.1 Properties<br />
• A <strong>Quorum</strong> system is a mathematical abstraction for guaranteeing consistency in fault-tolerant<br />
systems. Its goal is to maintain consistency while involving as few nodes as possible.<br />
For example, in a read/write distributed storage system of 3 processes, writing a value<br />
to any two processes (a majority) guarantees that any later read from any two processes<br />
returns a consistent value, because the two sets share at least one process.<br />
• <strong>Quorum</strong>s are critical for many applications in distributed computing, mainly where we<br />
want to avoid network partitioning.<br />
1.2 Applications<br />
<strong>Quorum</strong> systems have been used to implement a wide variety of distributed objects and<br />
services, for example,<br />
• replicated databases,<br />
• mutual exclusion,<br />
• read/write storage, and<br />
• group communication.<br />
We’ll cover classical quorum systems in the class and evaluate them using<br />
various measures. Specifically, we’ll look into the following two applications of quorum systems:<br />
1. distributed read/write storage (Lamport’s register) and<br />
2. consensus.<br />
1.2.1 Replicated Databases<br />
A major goal of replicated databases is to ensure consistency in the presence of failures.<br />
<strong>Quorum</strong> systems were first used to implement them by [Thomas, 1979], who proposed a<br />
majority consensus approach to concurrency control<br />
over multiple copies of a replicated database. The majority approach works as follows. To<br />
write data into the database, the writer timestamps the data (Lamport timestamp) and<br />
writes it to a majority of servers. To read data from the database, the reader<br />
contacts a (possibly different) majority and returns the data with the highest timestamp.<br />
Later, arbitrary quorum sizes (not just majorities) were also allowed, but all of them required<br />
non-empty pairwise intersections between quorums (the main reason behind the consistency<br />
of a <strong>Quorum</strong> system). In fact, separate read and write quorums have also been studied, where only<br />
quorums of different classes need to intersect (for example, a read quorum and a write quorum must intersect,<br />
but two read quorums need not).<br />
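The majority approach described above can be sketched in a few lines. This is only an illustration under simplifying assumptions, not Thomas's actual protocol: operations run one at a time, no server fails, and a plain integer stands in for the Lamport timestamp. The names `Replica`, `pick_majority`, `write`, and `read` are hypothetical.

```python
import random

class Replica:
    """One server holding a (timestamp, value) pair."""
    def __init__(self):
        self.timestamp = 0
        self.value = None

def pick_majority(replicas, rng):
    """Choose an arbitrary majority of the replicas (hypothetical helper)."""
    return rng.sample(replicas, len(replicas) // 2 + 1)

def write(replicas, timestamp, value, rng):
    """Store the timestamped value on some majority of the replicas."""
    for r in pick_majority(replicas, rng):
        if timestamp > r.timestamp:       # never overwrite newer data
            r.timestamp, r.value = timestamp, value

def read(replicas, rng):
    """Contact some majority and return the value with the highest timestamp."""
    contacted = pick_majority(replicas, rng)
    newest = max(contacted, key=lambda r: r.timestamp)
    return newest.value

rng = random.Random(0)
servers = [Replica() for _ in range(5)]
write(servers, 1, "a", rng)
write(servers, 2, "b", rng)
print(read(servers, rng))   # prints b
```

Whichever majorities the random choices select, the read majority shares at least one server with the majority that stored timestamp 2, so the read returns the latest value.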
1.3 <strong>System</strong> Model<br />
Let S = {s_1, s_2, . . . , s_n} be the set of n processes that constitute the distributed system.<br />
Each process runs its own protocol. For now, we assume a fixed set of processes, but<br />
this model can be extended to deal with systems with dynamic process membership, where<br />
processes join or leave the system (and hence S is not fixed).<br />
Definition 2 A process is a state machine (or I/O automaton) that moves between<br />
states by performing process actions (called transitions), as defined below. Every<br />
process s has input, internal, and output actions, denoted by in(s), int(s), and out(s) respectively.<br />
The external actions of a process are its inputs (recv) and its outputs<br />
to other processes (send). More formally,<br />
1. ext(s) = in(s) ∪ out(s)<br />
2. local(s) = int(s) ∪ out(s).<br />
Definition 3 An execution is a (possibly infinite) sequence of alternating global states and<br />
process actions. More formally, e = st_0, π_1, st_1, π_2, . . ., where e is the execution, π_i is the<br />
process action, and st_i is the global state, such that (st_i, π_{i+1}, st_{i+1}) represents one step of a<br />
system. Furthermore, action π_{i+1} is enabled in global state st_i.<br />
Definition 4 A partial execution is a finite prefix of some execution. A (partial) execution<br />
e extends a partial execution e′ if e′ is a prefix of e.<br />
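Definitions 3 and 4 can be made concrete with a toy system. In the sketch below, a partial execution is a Python list that alternates global states and actions. The transition function `step` is an assumption made for illustration: it is deterministic and returns None when the action is not enabled. The counter system is likewise hypothetical.

```python
def is_partial_execution(seq, step):
    """Check that seq = [st_0, pi_1, st_1, ...] alternates states and actions
    and that every triple (st_i, pi_{i+1}, st_{i+1}) is one step of the system.

    `step(state, action)` is a hypothetical transition function returning the
    next global state, or None when the action is not enabled in `state`.
    """
    if len(seq) % 2 == 0:              # must start and end with a state
        return False
    for i in range(0, len(seq) - 2, 2):
        state, action, nxt = seq[i], seq[i + 1], seq[i + 2]
        if step(state, action) != nxt:
            return False
    return True

def extends(e, e_prime):
    """True if e_prime is a prefix of e (both assumed to be well formed)."""
    return len(e_prime) <= len(e) and e[:len(e_prime)] == e_prime

# Toy system: the global state is a counter and the only action is "inc".
def counter_step(state, action):
    return state + 1 if action == "inc" else None

e = [0, "inc", 1, "inc", 2]
print(is_partial_execution(e, counter_step))               # True
print(extends(e, [0, "inc", 1]))                           # True
print(is_partial_execution([0, "dec", 1], counter_step))   # False: "dec" is never enabled
```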
1.4 Communication models<br />
As is typical in distributed computing, we’ll consider two communication models.<br />
1.4.1 Message passing model<br />
Each message is delivered point-to-point in this model, and each send and recv action is<br />
atomic, but a broadcast is implemented by a sequence of send actions (hence, not atomic).<br />
A complete bi-directional network with links between every pair of processes is also assumed.<br />
If {s_1, s_2, . . . , s_n} is the set of automata and {l_{ij} : s_i, s_j are processes} is the set of channels<br />
between processes, then the set {s_i : 1 ≤ i ≤ n} ∪ {l_{ij} : s_i, s_j are processes} constitutes<br />
the global automaton of the system, and the states of all its elements at some time form the<br />
global state of the automaton at that time.<br />
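To see why a broadcast built from individual sends is not atomic, consider the sketch below. It is an illustration under assumptions that the notes do not state: each channel l_ij is modeled as a queue, processes are numbered from 0, and the `crash_after` parameter is a hypothetical way to stop the sender partway through its sequence of sends.

```python
from collections import deque

def make_channels(n):
    """One channel l_ij for every ordered pair of distinct processes."""
    return {(i, j): deque() for i in range(n) for j in range(n) if i != j}

def send(channels, i, j, msg):
    """A single atomic send action from process i to process j."""
    channels[(i, j)].append(msg)

def broadcast(channels, i, n, msg, crash_after=None):
    """Broadcast built from n - 1 sends. If crash_after is given, the sender
    stops after that many sends, so only some processes get the message."""
    sent = 0
    for j in range(n):
        if j == i:
            continue
        if crash_after is not None and sent >= crash_after:
            break
        send(channels, i, j, msg)
        sent += 1
    return sent

channels = make_channels(4)
broadcast(channels, 0, 4, "hello", crash_after=2)
reached = [j for j in range(1, 4) if channels[(0, j)]]
print(reached)   # [1, 2]: process 3 never receives the message
```

The dictionary of channel queues, together with the local states of the processes, plays the role of the global state described above.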
1.4.2 Shared memory model<br />
In this model, processes communicate through operations on shared objects. Suppose {s_1, s_2, . . . , s_n}<br />
are the processes and {O_1, O_2, . . . , O_m} are the shared objects. Then the following<br />
are the atomic actions in this model:<br />
• inv_s(O, op, v): process s invokes op on O (output action of s, input action of O), and<br />
• resp_s(O, op, v): process s receives the response (input action of s, output action of O),<br />
where v is the value returned by the operation op.<br />
Definition 5 An operation is complete in an execution if its invocation has corresponding<br />
matching response, and pending if its invocation has no response.<br />
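Definition 5 can be illustrated by scanning a sequence of invocation and response events. The sketch below makes assumptions that go beyond the notes: events are tuples `(kind, process, obj, op)`, and each process has at most one outstanding operation at a time, so a response is matched to that process's latest unanswered invocation. The function name `classify_operations` is hypothetical.

```python
def classify_operations(events):
    """Split the operations in a history into complete and pending ones.

    `events` is a list of tuples (kind, process, obj, op), where kind is
    "inv" or "resp". Assumption: at most one outstanding operation per process.
    """
    pending = {}      # process -> (obj, op) still awaiting a response
    complete = []
    for kind, process, obj, op in events:
        if kind == "inv":
            pending[process] = (obj, op)
        elif kind == "resp" and pending.get(process) == (obj, op):
            complete.append((process, obj, op))
            del pending[process]
    return complete, [(p, o, op) for p, (o, op) in pending.items()]

history = [
    ("inv", "s1", "O", "write"),
    ("inv", "s2", "O", "read"),
    ("resp", "s1", "O", "write"),
]
complete, pending = classify_operations(history)
print(complete)   # [('s1', 'O', 'write')]
print(pending)    # [('s2', 'O', 'read')]
```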
1.5 Process failures<br />
The following are the usual kinds of failures that objects and processes can suffer in both shared<br />
memory and message-passing systems.<br />
Definition 6 A process is correct if it behaves according to its protocol. A<br />
crash failure is one where the process (or shared object) permanently stops executing its protocol.<br />
A process is benign if it is correct or it has a crash failure. A process that is not benign is<br />
called Byzantine or malicious.<br />
Byzantine failures are the worst kind of failures. They come in two types:<br />
1. Unauthenticated: Here, a process can pretend to be some other process and can possibly<br />
forge the signatures of others. Processes can send arbitrary messages in the message passing<br />
model or invoke arbitrary operations in the shared memory model.<br />
2. Authenticated: Here, we assume that each process has a digital signature,<br />
and thus no process can forge another process’s signature.<br />
Definition 7 A fault configuration is a vector C ∈ {0, 1}^n such that C_i = 1 if and only if<br />
the process s_i has failed.<br />
Definition 8 Given a set of processes S and an execution e, we define alive(e, S) as the set<br />
of correct processes in S, and faulty(e, S) as the set of faulty processes in S.<br />
We’ll write them as alive(S) and faulty(S), when e is clear from the context.<br />
Definition 9 A set of processes Q ⊆ S is available if Q ⊆ alive(S).<br />
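Definitions 7 to 9 fit together as follows. The sketch below is illustrative only. It assumes that the alive and faulty sets are read directly from a fault configuration C rather than derived from an execution e, and it represents processes as strings.

```python
def alive(processes, config):
    """Processes whose entry in the fault configuration is 0."""
    return {s for s, failed in zip(processes, config) if failed == 0}

def faulty(processes, config):
    """Processes whose entry in the fault configuration is 1."""
    return {s for s, failed in zip(processes, config) if failed == 1}

def is_available(quorum, processes, config):
    """A set Q is available if Q is a subset of the alive processes."""
    return set(quorum) <= alive(processes, config)

S = ["s1", "s2", "s3", "s4"]
C = [0, 1, 0, 0]                              # only s2 has failed
print(sorted(alive(S, C)))                    # ['s1', 's3', 's4']
print(is_available({"s1", "s3"}, S, C))       # True
print(is_available({"s1", "s2"}, S, C))       # False: s2 is faulty
```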
We also consider a probabilistic fault-tolerant model, where each process s_i in the set S fails<br />
independently with probability p_i.<br />
Definition 10 If p_i = p for all i, then it’s called a uniform probabilistic fault-tolerant model.<br />
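Because failures are independent in this model, the probability that a given set Q is available (every process in Q is alive) is the product of (1 - p_i) over the processes in Q. The sketch below computes this product. The function name `availability` and the dictionary of failure probabilities are assumptions made for illustration.

```python
from math import prod

def availability(quorum, fail_prob):
    """Probability that every process in `quorum` is alive, assuming each
    process s fails independently with probability fail_prob[s]."""
    return prod(1.0 - fail_prob[s] for s in quorum)

# Uniform model: every process fails with the same probability p = 0.1.
p = {"s1": 0.1, "s2": 0.1, "s3": 0.1}
print(availability({"s1", "s2"}, p))   # approximately 0.81
```

In the uniform model the result depends only on the size of Q, since it reduces to (1 - p) raised to the power |Q|.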
References<br />
[Thomas, 1979] Thomas, R. H. (1979). A majority consensus approach to concurrency control<br />
for multiple copy databases. ACM Trans. Database Syst., 4(2):180–209.<br />