12.07.2015 Views

Unreliable Failure Detectors for Reliable Distributed Systems

262 T. D. CHANDRA AND S. TOUEG

failure detectors can be grouped into four classes according to the actual accuracy property that they satisfy:

- $\mathcal{SF}(k)$: the class of Strongly $k$-Mistaken failure detectors,
- $\mathcal{SF}$: the class of Strongly Finitely Mistaken failure detectors,
- $\mathcal{WF}(k)$: the class of Weakly $k$-Mistaken failure detectors, and
- $\mathcal{WF}$: the class of Weakly Finitely Mistaken failure detectors.

Clearly, $\mathcal{SF}(0) \succeq \mathcal{SF}(1) \succeq \cdots \succeq \mathcal{SF}(k) \succeq \mathcal{SF}(k+1) \succeq \cdots \succeq \mathcal{SF}$. A similar order holds for the $\mathcal{WF}$s. Consider a system of $n$ processes of which at most $f$ may crash. In this system, there are at least $n - f$ correct processes. Since any failure detector $\mathcal{D} \in \mathcal{SF}((n - f) - 1)$ makes fewer mistakes than the number of correct processes, there is at least one correct process that $\mathcal{D}$ never suspects. Thus, $\mathcal{D}$ is also weakly 0-mistaken, and we conclude that $\mathcal{SF}((n - f) - 1) \succeq \mathcal{WF}(0)$. Furthermore, it is clear that $\mathcal{SF} \succeq \mathcal{WF}$.

These classes of repentant failure detectors can be ordered by reducibility into an infinite hierarchy, which is illustrated in Figure A1 (an edge $\rightarrow$ represents the $\succeq$ relation). Each failure detector class defined in Section 2.4 is equivalent to some class in this hierarchy. In particular, it is easy to show that:

[Observation A2.1 is not preserved in this copy.]

For example, it is easy to see that the algorithm in Figure 3 transforms any failure detector in $\mathcal{WF}$ into one in $\Diamond\mathcal{W}$. Other conversions are similar or straightforward and are therefore omitted. Note that $\mathcal{P}$ and $\Diamond\mathcal{W}$ are the strongest and weakest failure detector classes in this hierarchy, respectively. From Corollaries 6.1.9 and 7.1.8, and Observation A2.1, we have:

COROLLARY A2.2. Consensus and Atomic Broadcast are solvable using $\mathcal{WF}(0)$ in asynchronous systems with $f < n$.

Similarly, from Corollaries 6.2.4 and 7.1.8, and Observation A2.1, we have:

COROLLARY A2.3. Consensus and Atomic Broadcast are solvable using $\mathcal{WF}$ in asynchronous systems with $f < \lceil n/2 \rceil$.

A3. Tight Bounds on Fault-Tolerance

Since Consensus and Atomic Broadcast are equivalent in asynchronous systems with any number of faulty processes (Corollary 7.1.7), we can focus on establishing fault-tolerance bounds for Consensus. In Section 6, we showed that failure detectors with perpetual accuracy (i.e., in $\mathcal{P}$, $\mathcal{Q}$, $\mathcal{S}$, or $\mathcal{W}$) can be used to solve Consensus in asynchronous systems with any number of failures. In contrast, with failure detectors with eventual accuracy (i.e., in $\Diamond\mathcal{P}$, $\Diamond\mathcal{Q}$, $\Diamond\mathcal{S}$, or $\Diamond\mathcal{W}$), Consensus can be solved if and only if a majority of the processes are correct. We now refine this result by considering each failure detector class $\mathcal{C}$ in our infinite hierarchy, and determining how many correct processes are necessary to solve Consensus using $\mathcal{C}$. The results are illustrated in Figure A1.
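
The counting argument of Section A2 and the crash bounds stated in Corollaries A2.2 and A2.3 can be checked on a small instance. The Python sketch below is illustrative only and is not taken from the paper: it assumes a simplified model in which each mistake is a single wrongful suspicion of one correct process, and the names `never_suspected` and `max_tolerated_crashes` are hypothetical helpers introduced for this example.

```python
from math import ceil


def never_suspected(correct, mistakes):
    """Return the correct processes that no mistake ever suspects.

    `correct` is a set of process ids. `mistakes` is a list of process ids,
    one entry per wrongful suspicion (a simplified, assumed model of the
    mistakes a failure detector makes).
    """
    wrongly_suspected = set(mistakes) & set(correct)
    return set(correct) - wrongly_suspected


def max_tolerated_crashes(n, perpetual_weak_accuracy):
    """Largest f allowed by the bounds quoted in the corollaries.

    f < n when a weakly 0-mistaken detector is available (Corollary A2.2),
    f < ceil(n / 2) when the detector is only weakly finitely mistaken
    (Corollary A2.3).
    """
    bound = n if perpetual_weak_accuracy else ceil(n / 2)
    return bound - 1


# n = 5 processes, at most f = 2 crashes, so at least n - f = 3 are correct.
n, f = 5, 2
correct = {0, 1, 2}

# A detector limited to (n - f) - 1 = 2 mistakes cannot wrongly suspect
# all three correct processes, so at least one is never suspected.
mistakes = [0, 2]
safe = never_suspected(correct, mistakes)
```

With fewer mistakes than correct processes, `safe` can never be empty, which is the pigeonhole step behind the conclusion that such a detector is also weakly 0-mistaken. The second helper only restates the two inequalities; it does not model the Consensus algorithms themselves.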
