13.07.2015 Views

System Architecture Design



pSHIELD System Architecture Design (PU)

These intrinsic SPD capabilities can be configured by the pSHIELD Overlay and composed by the pSHIELD Middleware Core SPD Services; however, they apply autonomously and continuously in the pSHIELD Node.

6.1.1.2.1 Dependability

Dependability is mainly assured by the Dependability block, the SPD Node Status block and the functionalities embedded in the other blocks, such as those described in the Node pSHIELD Specific Component, which provide the status of the module, checkpointing information, self-test and rollback-recovery.

If an error is detected by any of the modules, the Dependability block triggers system recovery. The Dependability block itself must be able to detect its own failure by means of a redundant component, such as a watchdog timer that starts system recovery.

Other modules cover further aspects of dependability; for example, the Power Management handles power failures.

There are thus several levels of dependability:

• Each module of the pSHIELD Node Adapter has an internal Health Status Module (HSM) that monitors its health and periodically sends health status information to the SPD Node Status. The SPD Node Status sends a periodic Heartbeat containing the global layer status to the Dependability module.
• On error, the HSM may inhibit the monitored module, performing a fail-fast operation. If the SPD Node Status stops receiving status information from one of the HSMs, receives error information, or receives information that is itself erroneous, it stops the Heartbeat. On timeout, the Dependability module starts a recovery procedure.
• The SPD Node Status may also perform other health status monitoring operations, such as a Power-On Self-Test.
• The Dependability module itself sends health status information to the SPD Node Status.
• If the Dependability module fails, the SPD Node Status halts the system.
• On permanent failure of one of the modules, the Dependability module may halt the system.
• The Power Management assures system availability by managing redundant power sources or triggering a low-power mode if the power level is low.
• The Dependability module contains a Stable Storage, assuring data survivability for rollback recovery.

The pSHIELD power node may exhibit advanced recovery and reconfigurability capabilities through partial FPGA reconfiguration [10]. Recent advances in FPGA technology offer the possibility of repairing a failed module by reloading the bit stream in the FPGA frames that contained this module [11]. Furthermore, this FPGA reconfiguration may be used for changing the device functionality during runtime.

[10] "In-Circuit Partial Reconfiguration of RocketIO Attributes", http://www.xilinx.com/support/documentation/application_notes/xapp662.pdf
     "Two flows for Partial Reconfiguration: Module Based or Difference Based", http://www.xilinx.com/support/documentation/application_notes/xapp290.pdf
     "Dynamic Reconfiguration of RocketIO MGT Attributes", http://www.xilinx.com/support/documentation/application_notes/xapp660.pdf
[11] Cheatham (portal.acm.org/citation.cfm?id=1142167)

PU, D2.3.2, Issue 5, Page 73 of 122
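
The heartbeat chain described in the dependability levels (HSM reports to the SPD Node Status, Heartbeat to the Dependability module, recovery on timeout) can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the pSHIELD implementation: the class names, the module names "power" and "crypto", the tick-based logical clock and both timeout values are hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum


class Health(Enum):
    OK = "ok"
    ERROR = "error"


@dataclass
class SpdNodeStatus:
    """Collects HSM reports and decides whether a Heartbeat may be sent."""
    modules: tuple
    report_timeout: int  # assumed: ticks a module may stay silent
    last_report: dict = field(default_factory=dict)  # name -> (tick, Health)

    def report(self, module, tick, health):
        """Called by a module's HSM with its current health status."""
        self.last_report[module] = (tick, health)

    def heartbeat(self, now):
        """Return True if a Heartbeat is emitted at tick `now`.

        The Heartbeat is withheld if any module is silent for too long
        or has reported an error.
        """
        for name in self.modules:
            entry = self.last_report.get(name)
            if entry is None:
                return False
            tick, health = entry
            if health is not Health.OK or now - tick > self.report_timeout:
                return False
        return True


@dataclass
class Dependability:
    """Watches the Heartbeat and starts recovery when it times out."""
    heartbeat_timeout: int  # assumed: ticks without Heartbeat before recovery
    last_heartbeat: int = 0
    recoveries: int = 0

    def on_heartbeat(self, now):
        self.last_heartbeat = now

    def check(self, now):
        """Return True if a recovery procedure is started at tick `now`."""
        if now - self.last_heartbeat > self.heartbeat_timeout:
            self.recoveries += 1
            self.last_heartbeat = now  # restart the timeout after recovery
            return True
        return False


def simulate(ticks, silent_from):
    """Run the chain; the hypothetical 'crypto' HSM goes silent at `silent_from`."""
    status = SpdNodeStatus(modules=("power", "crypto"), report_timeout=1)
    dependability = Dependability(heartbeat_timeout=2)
    recovery_ticks = []
    for now in range(1, ticks + 1):
        status.report("power", now, Health.OK)
        if now < silent_from:
            status.report("crypto", now, Health.OK)
        if status.heartbeat(now):
            dependability.on_heartbeat(now)
        if dependability.check(now):
            recovery_ticks.append(now)
    return recovery_ticks


if __name__ == "__main__":
    print(simulate(8, silent_from=4))  # [7]
```

With these assumed timeouts, a module that goes silent at tick 4 causes the Heartbeat to stop at tick 5 and recovery to start at tick 7. Separating the two timeouts mirrors the layering in the text: the SPD Node Status judges the individual modules, while the Dependability module only observes the presence or absence of the Heartbeat.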
