Unreliable Failure Detectors for Reliable Distributed Systems


GOPAL, A., STRONG, R., TOUEG, S., AND CRISTIAN, F. 1990. Early-delivery atomic broadcast. In Proceedings of the 9th ACM Symposium on Principles of Distributed Computing (Quebec City, Que., Canada, Aug. 22-24). ACM, New York, pp. 297-310.

GUERRAOUI, R. 1995. Revisiting the relationship between non blocking atomic commitment and consensus. In Proceedings of the 9th International Workshop on Distributed Algorithms (Sept.). Springer-Verlag, New York, pp. 87-100.

HADZILACOS, V., AND TOUEG, S. 1993. Fault-tolerant broadcasts and related problems. In Distributed Systems, Chap. 5, S. J. MULLENDER, Ed., Addison-Wesley, Reading, Mass., pp. 97-145.

HADZILACOS, V., AND TOUEG, S. 1994. A modular approach to fault-tolerant broadcasts and related problems. Tech. Rep. 94-1425 (May), Computer Science Department, Cornell University, Ithaca, NY. Available by anonymous ftp from ftp://ftp.db.toronto.edu/pub/vassos/fault.tolerant.broadcasts.dvi.Z. (An earlier version is also available in Hadzilacos and Toueg [1993].)

HALPERN, J. Y., AND MOSES, Y. 1990. Knowledge and common knowledge in a distributed environment. J. ACM 37, 3 (July), 549-587.

LAMPORT, L. 1978. The implementation of reliable distributed multiprocess systems. Comput. Netw. 2, 95-114.

LAMPORT, L., SHOSTAK, R., AND PEASE, M. 1982. The Byzantine generals problem. ACM Trans. Prog. Lang. Syst. 4, 3 (July), 382-401.

LO, W. K., AND HADZILACOS, V. 1994. Using failure detectors to solve consensus in asynchronous shared-memory systems. In Proceedings of the 8th International Workshop on Distributed Algorithms (Sept.). Springer-Verlag, New York, pp. 280-295. Available from ftp://ftp.db.toronto.edu/pub/vassos/failure.detectors.shared.memory.ps.Z.

LOUI, M., AND ABU-AMARA, H. 1987. Memory requirements for agreement among unreliable asynchronous processes. Adv. Comput. Res. 4, 163-183.

MOSES, Y., DOLEV, D., AND HALPERN, J. Y. 1986. Cheating husbands and other stories: a case study of knowledge, action, and communication. Distrib. Comput. 1, 3, 167-176.

MULLENDER, S. J., ED. 1987. The Amoeba Distributed Operating System. Selected papers 1984-1987. Centre for Mathematics and Computer Science.

NEIGER, G. 1995. Failure detectors and the wait-free hierarchy. In Proceedings of the 14th ACM Symposium on Principles of Distributed Computing (Ottawa, Ont., Canada, Aug.). ACM, New York, pp. 100-109.

NEIGER, G., AND TOUEG, S. 1990. Automatically increasing the fault-tolerance of distributed algorithms. J. Algorithms 11, 3 (Sept.), 374-419.

PEASE, M., SHOSTAK, R., AND LAMPORT, L. 1980. Reaching agreement in the presence of faults. J. ACM 27, 2 (Apr.), 228-234.

PETERSON, L. L., BUCHOLZ, N. C., AND SCHLICHTING, R. D. 1989. Preserving and using context information in interprocess communication. ACM Trans. Comput. Syst. 7, 3 (Aug.), 217-246.

PITTELLI, F., AND GARCIA-MOLINA, H. 1989. Reliable scheduling in a TMR database system. ACM Trans. Comput. Syst. 7, 1 (Feb.), 25-60.

POWELL, D., ED. 1991. Delta-4: A Generic Architecture for Dependable Distributed Computing. Springer-Verlag, New York.

REISCHUK, R. 1982. A new solution for the Byzantine general's problem. Tech. Rep. RJ 3673 (Nov.), IBM Research Laboratory, Thomas J. Watson Research Center, Hawthorne, N.Y.

RICCIARDI, A., AND BIRMAN, K. P. 1991. Using process groups to implement failure detection in asynchronous environments. In Proceedings of the 10th ACM Symposium on Principles of Distributed Computing (Montreal, Que., Canada, Aug. 19-21). ACM, New York, pp. 341-354.

SABEL, L., AND MARZULLO, K. 1995. Election vs. consensus in asynchronous systems. Tech. Rep. TR95-411 (Feb.). Univ. California at San Diego, San Diego, Calif. Available at ftp://ftp.cs.cornell.edu/pub/sabel/tr94-1413.ps.

SCHNEIDER, F. B. 1990. Implementing fault-tolerant services using the state machine approach: A tutorial. ACM Comput. Surv. 22, 4 (Dec.), 299-319.

WENSLEY, J. H., LAMPORT, L., GOLDBERG, J., GREEN, M. W., LEVITT, K. N., MELLIAR-SMITH, P., SHOSTAK, R. E., AND WEINSTOCK, C. B. 1978. SIFT: Design and analysis of a fault-tolerant computer for aircraft control. Proc. IEEE 66, 10 (Oct.), 1240-1255.

RECEIVED JULY 1993; REVISED MARCH 1995; ACCEPTED OCTOBER 1995

Journal of the ACM, Vol. 43, No. 2, March 1996.
