12.07.2015 Views

Unreliable Failure Detectors for Reliable Distributed Systems



236 T. D. CHANDRA AND S. TOUEG

Every process $p$ executes the following:

    $\mathit{output}_p \leftarrow \emptyset$

    cobegin
    || Task 1: repeat forever
           {$p$ queries its local failure detector module $\mathcal{D}_p$}
           $\mathit{suspects}_p \leftarrow \mathcal{D}_p$
           send $(p, \mathit{suspects}_p)$ to all

    || Task 2: when receive $(q, \mathit{suspects}_q)$ for some $q$
           $\mathit{output}_p \leftarrow (\mathit{output}_p \cup \mathit{suspects}_q) - \{q\}$    {$\mathit{output}_p$ emulates $\mathcal{D}'_p$}
    coend

FIG. 3. $T_{\mathcal{D} \to \mathcal{D}'}$: From Weak Completeness to Strong Completeness.

accuracy properties that we defined in Section 2.3 then $\mathcal{D}'$ also does so. In other words, $T_{\mathcal{D} \to \mathcal{D}'}$ strengthens completeness while preserving accuracy.

This result allows us to focus on the four classes of failure detectors defined in the first row of Figure 1, that is, those with strong completeness. This is because $T_{\mathcal{D} \to \mathcal{D}'}$ (together with Observation 2.6.1) shows that every failure detector class in the second row of Figure 1 is actually equivalent to the class above it in that figure.

Informally, $T_{\mathcal{D} \to \mathcal{D}'}$ works as follows: Every process $p$ periodically sends $(p, \mathit{suspects}_p)$ to every process, where $\mathit{suspects}_p$ denotes the set of processes that $p$ suspects according to its local failure detector module $\mathcal{D}_p$. When $p$ receives a message of the form $(q, \mathit{suspects}_q)$, it adds $\mathit{suspects}_q$ to $\mathit{output}_p$ and removes $q$ from $\mathit{output}_p$ (recall that $\mathit{output}_p$ is the variable emulating the output of the failure detector module $\mathcal{D}'_p$).

In our algorithms, we use the notation "send $m$ to all" as a short-hand for "for all $q \in \Pi$: send $m$ to $q$." If a process $p$ crashes while executing this "for loop", it is possible that some processes receive the message $m$ while others do not.

Let $R = (F, H_{\mathcal{D}}, I, S, T)$ be an arbitrary run of $T_{\mathcal{D} \to \mathcal{D}'}$ using failure detector $\mathcal{D}$. In the following, the run $R$ and its failure pattern $F$ are fixed. Thus, when we say that a process crashes we mean that it crashes in $F$. Similarly, when we say that a process is correct, we mean that it is correct in $F$.
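
The two tasks of Figure 3 can be sketched in Python. This is only an illustration under simplifying assumptions that are not part of the paper's asynchronous model: execution proceeds in synchronous rounds, every message sent in a round is delivered in that same round, and a process crashes only between rounds, so a partial "send to all" never occurs. The names `run_transformation`, `crash_round`, and `example_module` are hypothetical and chosen for this sketch.

```python
from typing import Callable, Dict, Iterable, Set


def run_transformation(
    processes: Iterable[int],
    crash_round: Dict[int, int],
    local_suspects: Callable[[int, int], Set[int]],
    rounds: int,
) -> Dict[int, Set[int]]:
    """Round-based sketch of the Figure 3 transformation.

    crash_round[p] is the first round in which p no longer takes steps
    (assumption: crashes happen only at round boundaries).
    local_suspects(p, r) plays the role of p's local module D_p in round r.
    Returns output_p for every process after the last round.
    """
    procs = list(processes)
    output: Dict[int, Set[int]] = {p: set() for p in procs}

    def alive(p: int, r: int) -> bool:
        # Processes absent from crash_round never crash.
        return crash_round.get(p, rounds) > r

    for r in range(rounds):
        # Task 1: each live process queries D_p and sends (p, suspects_p) to all.
        messages = [(p, set(local_suspects(p, r))) for p in procs if alive(p, r)]
        # Task 2: each live process handles every message (q, suspects_q).
        for p in procs:
            if alive(p, r):
                for q, suspects_q in messages:
                    output[p] = (output[p] | suspects_q) - {q}
    return output


def example_module(p: int, r: int) -> Set[int]:
    """Hypothetical local module: only process 1 ever suspects the crashed
    process 4 (weak completeness); process 2 wrongly suspects 3 in round 0."""
    if p == 1:
        return {4}
    if p == 2 and r == 0:
        return {3}
    return set()


# Process 4 crashes before taking any step; processes 1, 2, 3 are correct.
result = run_transformation([1, 2, 3, 4], {4: 0}, example_module, rounds=3)
print(result)
```

In this run every correct process ends up suspecting process 4, although only process 1's local module suspected it. The mistaken suspicion of process 3 is dropped as soon as a message from process 3 is handled, which is the role of the `- {q}` step in Task 2.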
We will show that $\mathit{output}^R$ satisfies the following properties:

P1 (Transforming weak completeness into strong completeness). Let $p$ be any process that crashes. If eventually some correct process permanently suspects $p$ in $H_{\mathcal{D}}$, then eventually all correct processes permanently suspect $p$ in $\mathit{output}^R$. More formally:

$$\forall p \in \mathit{crashed}(F): \exists t \in \mathcal{T}, \exists q \in \mathit{correct}(F), \forall t' \geq t: p \in H_{\mathcal{D}}(q, t') \;\Rightarrow\; \exists t \in \mathcal{T}, \forall q \in \mathit{correct}(F), \forall t' \geq t: p \in \mathit{output}^R(q, t').$$

P2 (Preserving perpetual accuracy). Let $p$ be any process. If no process suspects $p$ in $H_{\mathcal{D}}$ before time $t$, then no process suspects $p$ in $\mathit{output}^R$ before time $t$. More formally:
