20.09.2015 Views

Practical ICT Experience – Flexibility – Worldwide References


Knowledge Evolution

Safety Critical & High Availability Systems Masterclass

Course ID: EPOL-10:026
Duration: 3 days
Number of participants: recommended optimum 15, maximum 25

Course objectives

The primary goal of this Masterclass is to give participants the skills necessary to design systems and software for real-time and embedded computers in which faults and failures could pose a danger to human life. As part of this, participants gain skills in designing systems for high availability. The training is practical and results-oriented, and the knowledge and skills it provides can be applied immediately.

This Masterclass examines the design of embedded systems and software that are to provide services in applications that could, when they fail, threaten the well-being or safety of people. Many, though not all, of these systems must not be stopped under any circumstances, and thus must be designed for high availability. Practical guidance is offered on how to address these concerns when designing systems in fields such as medical, automotive, avionics, nuclear and chemical process control.

The Masterclass surveys concepts and alternatives for system and software architectures appropriate for safety-critical and high availability systems. Following an examination of hazard and risk analysis techniques, the Masterclass goes on to cover a number of approaches to software safety that span fault avoidance, fault detection, and fault containment tactics, including redundancy, recovery, masking and barriers. A variety of candidate architectural design patterns are examined, including dual/triple modular redundancy, shutdown monitors, dissimilar independent designs, backup parallel patterns and active/monitor parallel patterns. Many real-world examples are presented.
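
As an illustration of the dual/triple modular redundancy patterns named above, the following is a minimal sketch of a majority voter, not material taken from the course. The function name `majority_vote` and the convention of returning `None` when no strict majority exists are assumptions made for this example.

```python
from collections import Counter
from typing import Hashable, Optional, Sequence


def majority_vote(outputs: Sequence[Hashable]) -> Optional[Hashable]:
    """Return the value reported by a strict majority of redundant channels.

    Returns None when there is no strict majority, so the caller can treat
    the disagreement as a detected fault (hypothetical convention).
    """
    if not outputs:
        return None
    # most_common(1) yields the single most frequent value and its count.
    value, count = Counter(outputs).most_common(1)[0]
    return value if count > len(outputs) // 2 else None


if __name__ == "__main__":
    # Three channels, one faulty: the fault is masked by the other two.
    print(majority_vote([42, 42, 41]))
    # Two channels that disagree: the fault is detected but cannot be masked.
    print(majority_vote([42, 41]))
```

With three channels a single faulty output is outvoted, whereas with two channels a mismatch can only be detected. That difference is the usual reason for choosing triple over dual redundancy when the system must keep running.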

Systems that are required to provide high availability must be designed to tolerate faults. Their design is usually based on off-the-shelf hardware and software combined in ways that achieve “five-nines” (99.999%) or greater availability. Basic hardware N-plexing and voting issues are discussed, followed by an in-depth study of several backward error recovery fault tolerance techniques, including Checkpoint-Rollback, Process Pairs, and Recovery Blocks. The class continues with several forward error recovery techniques. Software design approaches are discussed for run-time Built-In Self Test (BIST) of processor and peripheral hardware. Technical issues such as failover management, data replication, and software design defects are addressed in depth.
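
To make the “five-nines” figure concrete, the sketch below converts an availability target into a yearly downtime budget and estimates the availability of redundant units running in parallel. The function names are hypothetical, a 365-day year is assumed, and the parallel formula assumes independent failures with perfect failover, which real systems only approximate.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, assuming a 365-day year


def downtime_minutes_per_year(availability: float) -> float:
    """Yearly downtime budget implied by an availability fraction (0..1)."""
    return (1.0 - availability) * MINUTES_PER_YEAR


def parallel_availability(unit_availability: float, n_units: int) -> float:
    """Availability of n redundant units when any single unit suffices.

    Assumes independent failures and instantaneous, perfect failover.
    """
    return 1.0 - (1.0 - unit_availability) ** n_units


if __name__ == "__main__":
    # Five nines leaves a budget of roughly five and a quarter minutes a year.
    print(f"{downtime_minutes_per_year(0.99999):.2f} minutes per year")
    # Two "three-nines" units in parallel, under the stated assumptions.
    print(f"{parallel_availability(0.999, 2):.6f}")
```

Under these idealized assumptions, two units that are each 99.9% available combine to about 99.9999%. In practice, common-mode faults and failover time reduce that figure, which is why the failover management and data replication topics listed above matter.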

ericpol.com
