25.07.2014 Views

pdf download - Software and Computer Technology - TU Delft


State-of-the-Practice

Fault diagnosis at PMS


where a system is located, and start the search for the malfunctioning FRU. Some service engineers
are familiar with checking the log and start their search there. If the service engineer recognizes
the appropriate messages, the TBCB, CRCB and the Collimator are the starting point of the search.
From experience, the service engineer knows that the fuses of these components are the first suspects,
so these are checked first. Then the TBCB, CRCB and Collimator themselves are examined, which is
possible because various status LEDs give an indication of their health. If none of these seems to
be the cause, all other components are checked sequentially (looking at the LEDs, cabling, etc.)
for inconsistencies. Eventually, the service engineer will find out that CableB is not properly
connected or is broken. The latter can be detected by measurement.
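The search order described above can be sketched in a few lines of Python. This is only an illustration of the sequential procedure, not a tool used at PMS: the function name `manual_search`, the fuse labels, and the extra cable `CableA` are assumptions introduced for the example, and each inspection is reduced to a lookup in a table of known faults.

```python
from typing import Dict, List, Optional, Tuple

# Inspection order taken from the text: fuses of the suspected boards first,
# then the boards themselves (via their status LEDs), then everything else
# in sequence. The fuse labels and "CableA" are hypothetical placeholders.
SEARCH_ORDER: List[str] = [
    "TBCB fuse", "CRCB fuse", "Collimator fuse",
    "TBCB", "CRCB", "Collimator",
    "CableA", "CableB",
]


def manual_search(order: List[str],
                  faulty: Dict[str, bool]) -> Tuple[Optional[str], int]:
    """Return the first faulty component found and the number of inspections."""
    inspections = 0
    for component in order:
        inspections += 1
        # Stand-in for a physical check (LED, fuse, cable measurement).
        if faulty.get(component, False):
            return component, inspections
    return None, inspections


# The broken cable is only found after every earlier suspect has been checked.
print(manual_search(SEARCH_ORDER, {"CableB": True}))  # ('CableB', 8)
```

In this sketch the number of inspections grows linearly with the position of the fault in the search order, which is why a fault that experience does not rank highly is found late.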

2.3.1 Drawbacks

The process described above shows that finding a simple broken cable takes considerable time. In the
real case, there are even more cables, fuses and components. All of them must be checked, and this is
time consuming. Interviews with experts and service engineers show that it takes approximately one
hour to diagnose the system. If one of the more complex components (the components depicted on
the right of Figure 2.2) is broken and the LEDs do not indicate a malfunction, identifying it as
the culprit takes much more time. In these situations, the developers of the example power supply
need to help, because they know more about the system.

2.4 Optimal Fault Diagnosis

This section precisely defines and clarifies the problem that this thesis addresses. The previous
section described the current approach to fault diagnosis at PMS. A better diagnostic approach can
only be suggested if it is known which items make a fault diagnosis process good. If these items are
known, it is possible to evaluate both the current approach and alternative approaches. Section 2.4.1
introduces the items that an ideal fault diagnosis would have. Section 2.4.2 presents the evaluation,
using the items of an ideal fault diagnosis process as criteria. The final subsection shows which items
could be introduced or improved in a new approach to fault diagnosis in order to achieve
higher dependability.

2.4.1 Ideal Fault Diagnosis

The ideal approach to fault diagnosis utilizes as much information as possible in the search for root
causes of failures, in a way that optimizes dependable operation at minimum cost and risk. This
applies to PMS as well as to all companies constructing embedded systems. In order to approximate
this ideal approach as closely as possible, items are identified that make a fault diagnosis approach
’perfect’. These items can be used to evaluate diagnostic approaches. Dash and Venkatasubramanian
list some characteristics that a diagnostic system should ideally possess [9]. With this list as
a starting point, and through interviews with PMS employees, it is derived that the following items
are present in an ideal approach.

1. Accuracy. Accuracy is the extent to which a diagnosis produced by the diagnostic process
agrees with reality. If a diagnosis is inaccurate, there are two possible situations:

