Practical ICT Experience – Flexibility – Worldwide References
Knowledge Evolution<br />
Safety Critical & High Availability Systems Masterclass<br />
Course ID: EPOL-10:026<br />
Duration: 3 days<br />
Number of participants: recommended optimum 15, maximum 25<br />
Course objectives<br />
The primary goal of this Masterclass is to give the participant the skills necessary to design systems<br />
and software for real-time and embedded computers in which faults and failures could pose a danger to<br />
human life. As part of this, participants gain skills in designing systems for high availability. This is very<br />
practical, results-oriented training that provides knowledge and skills that can be applied immediately.<br />
This Masterclass examines the design of embedded systems and software that are to provide services<br />
in applications that could, when they fail, threaten the well-being or safety of people. Many, though not<br />
all, of these systems must not be stopped under any circumstances, and thus must be designed for high<br />
availability. Practical guidance is offered on how to address these concerns when designing systems<br />
in fields such as medical, automotive, avionics, nuclear and chemical process control.<br />
The Masterclass surveys concepts and alternatives for system and software architectures appropriate for<br />
safety-critical and high-availability systems. Following an examination of hazard and risk analysis<br />
techniques, the Masterclass covers a number of approaches to software safety that span fault avoidance,<br />
fault detection, and fault containment tactics including redundancy, recovery, masking<br />
and barriers. A variety of candidate architectural design patterns are examined, including dual/triple<br />
modular redundancy, shutdown monitors, dissimilar independent designs, backup parallel patterns<br />
and active/monitor parallel patterns. Many real-world examples are presented.<br />
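As an illustration of the modular redundancy patterns named above, the following is a minimal sketch of a majority voter, assuming each redundant channel reports a directly comparable value. The function name `majority_vote` is hypothetical and is not taken from the course material.

```python
from collections import Counter
from typing import Hashable, Optional, Sequence


def majority_vote(outputs: Sequence[Hashable]) -> Optional[Hashable]:
    """Return the value reported by a strict majority of channels.

    Returns None when no value has a strict majority, which a real system
    would treat as a detected but unmasked fault.
    """
    if not outputs:
        return None
    value, count = Counter(outputs).most_common(1)[0]
    return value if count > len(outputs) // 2 else None


# Triple modular redundancy: one faulty channel is outvoted (masked).
print(majority_vote([42, 42, 17]))  # 42
# All three channels disagree: the fault is detected, not masked.
print(majority_vote([1, 2, 3]))  # None
```

With two channels (dual modular redundancy) the same voter can only detect a disagreement. Masking a single fault needs at least three channels.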
Systems which are required to provide high availability must be designed to tolerate faults. Their design<br />
is usually based on off-the-shelf hardware and software combined in ways that will achieve “five-nines”<br />
(99.999%) or greater availability. Basic hardware N-plexing and voting issues are discussed, followed<br />
by an in-depth study of a number of backward error recovery fault tolerance techniques including<br />
Checkpoint-Rollback, Process Pairs, and Recovery Blocks. The class continues with several forward error<br />
recovery techniques. Software design approaches are discussed for run-time Built-In Self Test (BIST) of<br />
processor and peripheral hardware. Technical issues such as failover management, data replication,<br />
and software design defects are addressed in depth.<br />
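To make two of the ideas above concrete, the sketch below first converts an availability figure into allowed downtime per year, then shows a simplified Recovery Block: state is checkpointed, a primary routine runs, and an alternate is tried from the restored checkpoint if the result fails an acceptance test. All function names, the deliberately defective `faulty_sort`, and the 365.25-day year are assumptions made for illustration, not part of the course material.

```python
import copy
from typing import Any, Callable, Sequence

MINUTES_PER_YEAR = 365.25 * 24 * 60  # assumed average year length


def downtime_minutes_per_year(availability: float) -> float:
    """Maximum downtime per year allowed by a given availability fraction."""
    return (1.0 - availability) * MINUTES_PER_YEAR


def recovery_block(state: Any,
                   variants: Sequence[Callable[[Any], Any]],
                   acceptance_test: Callable[[Any], bool]) -> Any:
    """Try each variant on a fresh copy of the checkpointed state."""
    checkpoint = copy.deepcopy(state)  # checkpoint before any variant runs
    for variant in variants:
        working = copy.deepcopy(checkpoint)  # roll back to the checkpoint
        try:
            result = variant(working)
        except Exception:
            continue  # a crash counts as a failed variant
        if acceptance_test(result):
            return result
    raise RuntimeError("all variants failed the acceptance test")


def faulty_sort(items: list) -> list:
    """Hypothetical primary routine with a design defect: drops an element."""
    items.pop()
    return sorted(items)


def simple_sort(items: list) -> list:
    """Hypothetical alternate routine of independent design."""
    return sorted(items)


data = [3, 1, 2]


def accept(result: list) -> bool:
    """Acceptance test: same length as the input and in non-decreasing order."""
    return (len(result) == len(data)
            and all(a <= b for a, b in zip(result, result[1:])))


print(f"{downtime_minutes_per_year(0.99999):.2f}")  # 5.26
print(recovery_block(data, [faulty_sort, simple_sort], accept))  # [1, 2, 3]
```

Under these assumptions, "five-nines" allows a little over five minutes of downtime per year. The recovery block leaves the caller's `data` untouched because every variant works on a copy restored from the checkpoint.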
ericpol.com