1998 - Draper Laboratory

Advanced Fault-Tolerant Computing for Future Manned Space Missions

Jaynarayan H. Lala, The Charles Stark Draper Laboratory, Inc.
Andrew L. Benjamin, NASA Johnson Space Center

Based on the paper published in the Proceedings of the AIAA/IEEE Digital Avionics Systems Conference

Abstract

Recently, the planet Mars has been at the focal point of astronomical attention. As we enter the next millennium, Mars will play a key role in humanity’s successful expansion into heliocentric space. Future Mars space transportation will require reliable operations over a life span of years, unlike the Space Shuttle, which requires dependable operations over months, or the Space Station, which is close at hand for maintenance logistics. Also, unlike other long-life deep space missions, human Mars operations require real-time masking of critical faults. The longevity associated with deep space missions demands innovative fault-tolerant technology development and applications of advanced redundancy techniques to achieve high sustained ultrareliability for long-life missions. To enable human Mars exploration and meet system demands for increased safety, reliability, and autonomy, this paper presents a technology plan to foster the development of next-generation fault-tolerant computing technology. This paper discusses the stringent baseline requirements and constraints of Mars missions and presents fault-tolerant approaches, techniques, and design building block strategies that include standby redundancy, reconfigurable voting, backup sparing, and graceful degradation. Contemporary approaches and the recognized inadequacies of their application to long-duration space missions are discussed. Certain problems are identified and viable solutions offered. Various aspects of fault-tolerant designs and implementations are discussed, including component selection, radiation tolerance, high-density packaging technology, computational integrity, and fault coverage.
Architectural solutions that can make systems affordable, such as open systems, standardization, and ease of validation, will be highlighted. The paper concludes with a technology demonstration plan to achieve the desired baseline requirements and goals.

Introduction

At the NASA Johnson Space Center, a 2004 precursor Mars Transit Habitat flight demonstration is planned to validate long-life redundancy techniques that reduce life-cycle costs and unscheduled maintenance, and increase safety, reliability, and probability of mission success. There are many elements of perceived risk, including low-gravity and radiation effects and long-duration, lightweight, integrated microelectronics performance and component reliability. This mission opportunity provides early prototyping of basic long-life fault-tolerant design building blocks and methodologies that include standby redundancy, reconfigurable voting, backup sparing, and graceful degradation. Much work has already been accomplished in redundancy management techniques for space systems that operate dependably for months without external maintenance. With the constraints of low mass, power, and cost, a revolutionary fault-tolerant computing technology is required to survive human Mars long-life operations. The basic approach to achieving long-term ultrareliability requires intelligent allocation and use of redundant assets. On-chip redundancy using microelectronic Very Large Scale Integrated (VLSI) high-density packaging has the potential to revolutionize avionics design. Also, integrated microelectronic technology appears attractive for long-life space applications. The demands for operational redundancy, minimum power consumption, compact size, decreased mass, and improved reliability potential are important factors. There are numerous trade-offs to be made between system reliability and availability, with significant impact on the various metrics of system cost.
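The building blocks named above (reconfigurable voting, backup sparing, and graceful degradation) can be illustrated with a short sketch. This is a minimal illustration under stated assumptions, not the architecture proposed in the paper: the class name `RedundantChannelSet`, the `step` method, the integer-valued channels, and the policy of retiring a channel after a single disagreement with the majority are all hypothetical choices made for the example.

```python
from collections import Counter
from typing import Callable, List, Optional

# Hypothetical model: a "channel" is any function mapping an input to an output.
Channel = Callable[[int], int]


class RedundantChannelSet:
    """Majority voter over redundant channels with a pool of backup spares."""

    def __init__(self, active: List[Channel], spares: List[Channel]) -> None:
        self.active = list(active)   # channels currently taking part in the vote
        self.spares = list(spares)   # unused backup channels

    def step(self, x: int) -> Optional[int]:
        """Run all active channels on x, vote, then reconfigure.

        Returns the majority output, or None if no strict majority exists.
        """
        outputs = [ch(x) for ch in self.active]
        if not outputs:
            return None
        value, count = Counter(outputs).most_common(1)[0]
        if count <= len(outputs) // 2:
            return None  # no strict majority: the fault cannot be masked

        # Reconfiguration: every channel that disagreed with the majority is
        # retired. It is replaced by a spare while spares last (backup sparing);
        # once the pool is empty the set simply shrinks (graceful degradation).
        survivors: List[Channel] = []
        for ch, out in zip(self.active, outputs):
            if out == value:
                survivors.append(ch)
            elif self.spares:
                survivors.append(self.spares.pop(0))
        self.active = survivors
        return value


if __name__ == "__main__":
    good = lambda x: 2 * x   # assumed fault-free computation
    bad = lambda x: -1       # assumed permanently faulty channel

    channels = RedundantChannelSet([good, bad, good], spares=[good])
    print(channels.step(3), len(channels.active), len(channels.spares))
```

In this sketch a single faulty channel is masked in the same step in which it is detected, which is the real-time property the text asks for. The design choice of retiring a channel on its first disagreement is deliberately simple; a practical system would need to distinguish transient from permanent faults before consuming a spare.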
The advent of radiation-tolerant VLSI design tools makes reliable fault-tolerant designs and implementations feasible and cost-effective. Recent advances in VLSI and packaging technologies offer the possibility of building complex digital flight control systems that need no maintenance. As part of the New Millennium Program, the Jet Propulsion Laboratory (JPL) is currently validating and demonstrating very high density electronic packaging technologies, including Multichip Module (MCM) technology, 3-dimensional MCM stacking, and die stacking for memory, which attain many MIPS per cubic inch while minimizing power, weight, and volume.

Fault-Tolerant System Considerations

Many different fault-tolerant systems have been developed and demonstrated in the past. Redundancy has been a major consideration in flight control systems. In these applications, invalid information could cause a catastrophic failure of the control system and could result in vehicle loss. Traditionally,
