262 T. D. CHANDRA AND S. TOUEG

failure detectors can be grouped into four classes according to the actual accuracy property that they satisfy:

$\mathcal{SF}(k)$: the class of Strongly $k$-Mistaken failure detectors,
$\mathcal{SF}$: the class of Strongly Finitely Mistaken failure detectors,
$\mathcal{WF}(k)$: the class of Weakly $k$-Mistaken failure detectors, and
$\mathcal{WF}$: the class of Weakly Finitely Mistaken failure detectors.

Clearly, $\mathcal{SF}(0) \succeq \mathcal{SF}(1) \succeq \cdots \succeq \mathcal{SF}(k) \succeq \mathcal{SF}(k+1) \succeq \cdots \succeq \mathcal{SF}$. A similar order holds for the $\mathcal{WF}$ classes. Consider a system of $n$ processes of which at most $f$ may crash. In this system, there are at least $n - f$ correct processes. Since any failure detector $\mathcal{D} \in \mathcal{SF}((n - f) - 1)$ makes at most $(n - f) - 1$ mistakes, fewer than the number of correct processes, there is at least one correct process that $\mathcal{D}$ never suspects. Thus, $\mathcal{D}$ is also weakly 0-mistaken, and we conclude that $\mathcal{SF}((n - f) - 1) \succeq \mathcal{WF}(0)$. Furthermore, it is clear that $\mathcal{SF} \succeq \mathcal{WF}$.

These classes of repentant failure detectors can be ordered by reducibility into an infinite hierarchy, which is illustrated in Figure A1 (an edge $\rightarrow$ represents the $\succeq$ relation). Each failure detector class defined in Section 2.4 is equivalent to some class in this hierarchy. In particular, it is easy to show that:

OBSERVATION A2.1.
$\mathcal{P} \cong \mathcal{Q} \cong \mathcal{SF}(0)$,
$\mathcal{S} \cong \mathcal{W} \cong \mathcal{WF}(0)$,
$\Diamond\mathcal{P} \cong \Diamond\mathcal{Q} \cong \mathcal{SF}$, and
$\Diamond\mathcal{S} \cong \Diamond\mathcal{W} \cong \mathcal{WF}$.

For example, it is easy to see that the algorithm in Figure 3 transforms any failure detector in $\mathcal{WF}$ into one in $\Diamond\mathcal{W}$. Other conversions are similar or straightforward and are therefore omitted. Note that $\mathcal{P}$ and $\Diamond\mathcal{W}$ are the strongest and weakest failure detector classes in this hierarchy, respectively. From Corollaries 6.1.9 and 7.1.8, and Observation A2.1, we have:

COROLLARY A2.2. Consensus and Atomic Broadcast are solvable using $\mathcal{WF}(0)$ in asynchronous systems with $f < n$.

Similarly, from Corollaries 6.2.4 and 7.1.8, and Observation A2.1, we have:

COROLLARY A2.3. Consensus and Atomic Broadcast are solvable using $\mathcal{WF}$ in asynchronous systems with $f < \lceil n/2 \rceil$.

A3. Tight Bounds on Fault-Tolerance

Since Consensus and Atomic Broadcast are equivalent in asynchronous systems with any number of faulty processes (Corollary 7.1.7), we can focus on establishing fault-tolerance bounds for Consensus. In Section 6, we showed that failure detectors with perpetual accuracy (i.e., in $\mathcal{P}$, $\mathcal{Q}$, $\mathcal{S}$, or $\mathcal{W}$) can be used to solve Consensus in asynchronous systems with any number of failures. In contrast, with failure detectors that have only eventual accuracy (i.e., in $\Diamond\mathcal{P}$, $\Diamond\mathcal{Q}$, $\Diamond\mathcal{S}$, or $\Diamond\mathcal{W}$), Consensus can be solved if and only if a majority of the processes are correct. We now refine this result by considering each failure detector class $\mathcal{C}$ in our infinite hierarchy, and determining how many correct processes are necessary to solve Consensus using $\mathcal{C}$. The results are illustrated in Figure A1.
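The bounds recalled so far can be summarized as a small lookup. The following is only an illustrative sketch, not part of the original paper: the function names `max_faulty` and `consensus_solvable` are hypothetical, and the ASCII spellings of the class names (for instance `"<>W"` for the eventual class written with a diamond) are an assumption made for this example. It encodes only what the text states: any $f < n$ for the classes with perpetual accuracy and for $\mathcal{WF}(0)$ (Corollary A2.2), and $f < \lceil n/2 \rceil$ for the classes with eventual accuracy and for $\mathcal{WF}$ (Corollary A2.3).

```python
# Illustrative sketch only. Class names are spelled in ASCII by assumption:
# "<>W" stands for the eventual class written with a diamond in the text.
ANY_F_BELOW_N = {"P", "Q", "S", "W", "WF(0)"}            # perpetual accuracy
MAJORITY_CORRECT = {"<>P", "<>Q", "<>S", "<>W", "WF"}    # eventual accuracy


def max_faulty(detector_class: str, n: int) -> int:
    """Largest f for which the text states that Consensus is solvable
    in a system of n processes using the given failure detector class."""
    if n < 1:
        raise ValueError("need at least one process")
    if detector_class in ANY_F_BELOW_N:
        return n - 1                # solvable for every f < n
    if detector_class in MAJORITY_CORRECT:
        return (n + 1) // 2 - 1     # solvable when f < ceil(n/2)
    raise ValueError(f"unknown class: {detector_class!r}")


def consensus_solvable(detector_class: str, n: int, f: int) -> bool:
    """True if f faulty processes out of n is within the stated bound."""
    return 0 <= f <= max_faulty(detector_class, n)
```

For example, `consensus_solvable("<>W", 5, 2)` holds because a majority of the five processes is still correct, while `consensus_solvable("<>W", 4, 2)` does not. The classes $\mathcal{SF}(m)$ for $m \geq 1$ are deliberately left out, since their bounds are the subject of the refinement that follows.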
Unreliable Failure Detectors for Reliable Distributed Systems 263

[Figure A1. The infinite hierarchy of failure detector classes ordered by reducibility, annotated with the fault-tolerance bound for Consensus in each class: $\mathcal{SF}(0) \cong \mathcal{P}$ (strongest), Consensus solvable for all $f < n$; $\mathcal{SF}(1)$, Consensus solvable iff $f < n$; $\mathcal{SF}(2)$, Consensus solvable iff $f < n - 1$; ...]

THEOREM. In asynchronous systems with $f \geq \lceil n/2 \rceil$, if $m > n - f$ then Consensus cannot be solved using $\mathcal{SF}(m)$.

PROOF (SKETCH). Consider an asynchronous system with $f \geq \lceil n/2 \rceil$ and assume $m > n - f$. We show that there is a failure detector $\mathcal{D} \in \mathcal{SF}(m)$ such that no algorithm solves Consensus using $\mathcal{D}$. We do so by describing the behavior of a Strongly $m$-Mistaken failure detector $\mathcal{D}$ such that for every algorithm $A$, there is a run $R_A$ of $A$ using $\mathcal{D}$ that violates the specification of Consensus.

Since $1 \leq n - f \leq \lfloor n/2 \rfloor$, we can partition the processes into three sets $\Pi_0$, $\Pi_1$, and $\Pi_{crashed}$, such that $\Pi_0$ and $\Pi_1$ are non-empty sets containing $n - f$ processes each, and $\Pi_{crashed}$ is a (possibly empty) set containing the remaining $n - 2(n - f)$ processes. Henceforth, we only consider runs in which all processes in $\Pi_{crashed}$ crash at the beginning of the run. Let $q_0 \in \Pi_0$ and $q_1 \in \Pi_1$. Consider the following two runs of $A$ using $\mathcal{D}$:

Run $R_0 = (F_0, H_0, I_0, S_0, T_0)$. All processes propose 0. All processes in $\Pi_0$ are correct in $F_0$, while all the $f$ processes in $\Pi_1 \cup \Pi_{crashed}$ crash in