
Technische Universität Braunschweig

Diplomarbeit (Diploma Thesis)

AUSFALLPROGNOSEN MIT HILFE ERWEITERTER MONITORING SYSTEME
(Breakdown Prediction by the Use of Extended Monitoring Systems)

by
Christian Kaak

February 2007

Institut für Wirtschaftswissenschaften,
Lehrstuhl für Betriebswirtschaftslehre,
insbesondere Produktion und Logistik
Technische Universität Braunschweig

Examiner: Prof. Dr. T. Spengler
Supervisor: Dr. Grit Walther


Table of Contents

Index of Figures
Index of Tables
Index of Formulas
1 Introduction
1.1 Initial Position and Problem
1.2 Goals of this Study and Approach
2 Sensor Based Temperature Monitoring
2.1 Importance of Temperature Monitoring within Medical Laboratories
2.2 Functioning and Behavior of Freezers and Fridges
2.2.1 General Functioning of a Fridge
2.2.2 Technical Behavior of Fridges (Without External Influences)
2.2.3 Technical Behavior of Freezers (Without External Influences)
2.2.4 Behavior in Practice
2.2.5 Behavior in Case of a Malfunction
2.3 Current Practice of Sensor Based Temperature Monitoring
2.4 Problems and Potential Sources of Error
2.4.1 The Lack of Information Problem
2.4.2 Potential Sources of Error
2.4.3 Methodological Problems
2.5 Aimed Goal and Requirements Analysis
3 Current Monitoring Systems
3.1 XiltriX's Technical Basis
3.1.1 Basic Components of a XiltriX Installation
3.1.2 Other Installation Possibilities
3.2 XiltriX's Basic Functionality
3.2.1 Current Possibilities to Display and Analyze Stored Data
3.2.2 Documentation of Occurred Alarms
3.3 XiltriX's Additional Features
3.3.1 Different Types of Attachable Digital Switches
3.3.2 Time-Dependent Limit Settings
3.3.3 Alarm, SMS and E-Mail Programs
3.4 Review of XiltriX According to the Requirements Analysis
3.5 Other Major Monitoring Products in the Market
3.5.1 3M FreezeWatch and 3M MonitorMark Indicators
3.5.2 2DI ThermaViewer
3.5.3 Systems Offering Data Analysis in Retrospect
4 Current State of Research
4.1 Current State within the Setting of Sensor Based Temperature Monitoring
4.2 Current State within the Setting of Machinery Condition Monitoring
4.3 Current State within the Setting of Measurement Data Analysis
4.3.1 Basic Approaches
4.3.2 A Generalized Approach
4.4 Review of Current State of Research
5 Possible and Promising Ways of Data Analysis
5.1 The Six Possible Levels of Data Analysis
5.2 Different Kinds of Statistical Analysis
5.3 Basic Descriptive Statistical Measures
5.4 Regression
5.4.1 The Determination of Regression Functions
5.4.2 The Major Problems of Regression
5.5 Time Series Analysis
5.6 Failure and Availability Ratios
5.7 Markov Chains
5.8 Inferential Statistics
5.9 Data Mining
5.9.1 General Fields of Application
5.9.2 Artificial Neural Networks
5.9.3 Non-Applicability of Artificial Neural Networks to Current Datasets
5.10 Promising Analyzing Methods
5.10.1 Promising Application of Basic Descriptive Statistics
5.10.2 Detection of Changes in Behavior by the Use of Regression
5.10.3 Classification by Using Past Behavior
5.10.4 Review
6 Implementation and Case Study
6.1 Implementation of Promising Analyzing Methods
6.2 Case Study
6.2.1 Detection of Changes in Behavior by Using Descriptive Statistics
6.2.2 Detection of Changes in Behavior by the Use of Regression
6.2.3 Classification of Alarms by the Use of Historical Data
6.3 Review
6.4 Recommendations
7 Summary
Bibliography
Appendix 1 – Implementation of Interpolation
Appendix 2 – Implementation of Statistical Methods
Appendix 3 – Implementation of Data Mining Methods
Erklärung (Statement)


Index of Figures

Figure 2-1: Cooling Cycle of a Household Refrigerator (adapted) [UniMunich06]
Figure 2-2: Temperature Sequence of a Properly Working 6°C Passive Fridge [DEMO06]
Figure 2-3: Temperature Sequence of a Properly Working 6°C Active Fridge [DEMO06]
Figure 2-4: Temperature Sequence of a -80°C Active Freezer [DEMO06]
Figure 2-5: Temperature Sequence of a -80°C Passive Freezer [DEMO06]
Figure 2-6: Temperature Sequence of a -20°C Active Freezer [DEMO06]
Figure 2-7: Temperature Sequence of a Cryogenic Freezer in Practical Use [UMC06]
Figure 2-8: Lack of Information Problem Caused by Sensor Based Temperature Monitoring
Figure 2-9: The Problem of Unknown Behavior between Two Single Data Points
Figure 2-10: Estimated Answers of Statistics and Data Mining
Figure 3-1: Flowchart of the Temperature Monitoring Task
Figure 3-2: XiltriX - Schematic Drawing of an Installation with Basic Components
Figure 3-3: XiltriX - The Main Screen [DEMO06]
Figure 3-4: XiltriX - Stored Data in Table Form [DEMO06]
Figure 3-5: XiltriX - Stored Data in Graphical Form [DEMO06]
Figure 3-6: XiltriX - Available Statistical Information [DEMO06]
Figure 3-8: XiltriX - Time Dependent Limit Settings [DEMO06]
Figure 3-9: XiltriX - Setting up an Alarm Relay [DEMO06]
Figure 3-13: Centron - A Sample Graph with Multiple Scales [Rees06]
Figure 4-1: General Overview of the Generalized Approach ([Daßler95], p. 22) (adapted)
Figure 4-2: A Delayed Trend Recognition Due to Removal of "Outliers"
Figure 5-1: Two Samples of Regression ([Bourier03], p. 167) (adapted)
Figure 5-2: Incorrect Regression Function due to an Outlier ([Eckey02], p. 180) (adapted)
Figure 5-3: Correct Regression Function ([Eckey02], p. 180) (adapted)
Figure 5-4: Sales of an Industrial Heater [Chatfield04]
Figure 5-5: Sample Transition Probability Graph
Figure 5-6: Functioning of an Artificial Neuron ([Hagen97], p. 8) (adapted)
Figure 6-1: Exported XiltriX Data (An Excerpt)
Figure 6-2: Temperature Overview of the Selected Sample Dataset
Figure 6-3: Maximum Values at Daytime
Figure 6-4: Maximum Values at Nighttime
Figure 6-5: Minimum Values at Daytime
Figure 6-6: Minimum Values at Nighttime
Figure 6-7: Mean Values at Daytime
Figure 6-8: Mean Values at Nighttime
Figure 6-9: Standard Deviation at Daytime
Figure 6-10: Standard Deviation at Nighttime
Figure 6-11: Daily Door Openings and Temperature Distribution of the Selected Dataset
Figure 6-12: Regression Function for the Selected Dataset


Index of Tables

Table 2-1: Error of First and Second Kind
Table 3-1: Listing of Existing Table Color Codes and Their Meaning
Table 3-2: Listing of Existing Status Bar Color Codes and Their Meaning
Table 3-3: Compliance of XiltriX According to the Requirements Analysis
Table 3-4: Compliance of 3M Indicators According to the Requirements Analysis
Table 3-5: Compliance of the 2DI ThermaViewer According to the Requirements Analysis
Table 4-1: Compliance of the Generalized Approach According to the Requirements Analysis
Table 5-1: Estimated Improvements
Table 6-1: Import Problems of Tested Software Products
Table 6-2: The Chosen Deltas
Table 6-3: Reported Notifications (Based on Nighttime Data)
Table 6-4: Classification of Alarms
Table 6-5: Results of Classification According to Single Criteria
Table 6-6: Achieved Improvements


Index of Formulas

Formula 4-1: Threshold Value to Determine Potential Outliers
Formula 4-2: Calculation of Noise
Formula 4-3: Calculation of Curve Stability
Formula 4-4: Calculation of Prediction Stability
Formula 5-1: The Median Formula
Formula 5-2: The Arithmetic Mean Formula
Formula 5-3: The Standard Deviation Formula
Formula 5-4: Method of Least Squares
Formula 5-5: Method of Least Squares for an Assumed Linear Trend
Formula 5-6: Regression Function for Describing a Linear Trend
Formula 5-7: Coefficient of Determination
Formula 5-8: The Additive Component Model
Formula 5-9: The Multiplicative Component Model
Formula 5-10: The Definition of Availability [Masing88]
Formula 5-11: The Markov Property
Formula 5-12: Transition Probability Matrix
Formula 5-13: Conditions for the Transition Probability Matrix
Formula 5-14: Sample Transition Probability Matrix
Formula 5-15: Transition Probabilities of Several Changes in a Row
Formula 5-16: Formula of Chapman-Kolmogorov
Formula 5-17: Formula of Chapman-Kolmogorov (Simplified Version)
Formula 5-18: Identity Matrix as an Example of a Non-Converging Markov Chain
Formula 5-19: Definition of Neurons
Formula 5-20: Definition of V and F
Formula 5-21: Determination of Error
Formula 5-22: The Delta Rule
Formula 5-23: Hebb Learning Rule
Formula 6-1: Regression Function and Coefficient of Determination


1 Introduction

1.1 Initial Position and Problem

As more and more technical devices perform "mission critical" tasks within industry and medical research, the monitoring of such devices is becoming increasingly important. Possible malfunctions could damage these expensive devices or their contents, which can lead to very high costs (both direct and indirect, so-called collateral damage). That is why electronic sensor-based monitoring systems have become popular in recent years.

One of these systems is XiltriX, a hardware and software combination from the company Contell/IKS, which is currently applied to laboratory equipment. Its basic functionality is to monitor and record the temperature of fridges and/or the CO₂ concentration within incubators in order to prevent damage to the goods stored in these devices. Elementary functionality as well as some useful tools are already implemented: the customer or Contell/IKS defines critical minimum and maximum temperature limits, and as soon as a value is exceeded, the system warns a person in charge by e-mail or SMS.

The main question is now in which direction the development of the software should continue. The idea of this study is to extend the existing "reactive" XiltriX to a more "pro-active" system that recognizes trends and notifies a person in charge before the critical minimum and maximum temperature limits are exceeded. In addition, XiltriX should offer decision support that allows a person in charge to better classify the system's condition in situations of exceptional temperature levels.

After comparing XiltriX to other major monitoring products, this diploma thesis will work out some promising ideas to show the possibilities for further development. The main focus is on the recorded monitoring data as currently obtained from the field by XiltriX. At the moment this data is only accessible numerically or in the form of a graph. Analyzing the graphs manually in retrospect has already helped to predict malfunctions, but the results rely on the experience and, above all, the instinct of Contell/IKS staff. At present it is not clear how reliable this intuitive data analysis really is. It is also problematic that, with this kind of data analysis, even an experienced person needs a lot of time, because the graphs from every single sensor have to be inspected manually.

1.2 Goals of this Study and Approach

The main task of this research is to determine whether, and in which way, it is possible to reliably predict malfunctions and to give decision support to the customer.


Therefore, statistical and data mining methods shall be applied to the currently available datasets.

Hence, existing customer data has to be collected and analyzed. Furthermore, the above-mentioned methods have to be evaluated with regard to their feasibility for offering additional decision support and reliably predicting malfunctions. This study will stick to the data currently monitored by sensors in the field and will add no new measurement data. Moreover, it is necessary to point out the added value for the customer of the solutions found.

The first step of this research is to define what monitoring is about, to explain its general importance and current practice, to identify existing problems, and to work out a requirements analysis for a monitoring system within the setting of sensor-based temperature monitoring of fridges. This is done in chapter 2. Afterwards, a review of XiltriX and other major monitoring products is given in chapter 3 to point out their level of compliance with the worked-out requirements. The succeeding chapter 4 will review the current state of research.

Based on these results in combination with literature research, chapter 5 introduces suggestions as to how the system could be improved. The intended results are:

1. To gain additional knowledge of the cooling device's condition from recorded datasets, in order to offer additional decision support in case of an exceptional temperature level
2. To offer software that reliably predicts upcoming malfunctions

Offering the customer additional knowledge regarding the equipment's condition leads to the idea of determining what important information can be retrieved from the currently recorded datasets. Therefore, statistical and data mining methods are evaluated on their ability to offer additional information. This evaluation addresses questions like:

• Which statistical methods can be applied?
• What knowledge gain do they offer?
• What are the benefits for the operational staff?

A determination of whether the possible knowledge gain is sufficient to also reliably predict upcoming malfunctions, followed by a case study, will conclude this study.


2 Sensor Based Temperature Monitoring

This diploma thesis focuses on sensor-based temperature monitoring of freezers and fridges within medical laboratories. Due to the way a fridge functions and the insufficient data on a large number of possible external influences, this setting is faced with particular problems. The Dutch company Contell/IKS supported this thesis by providing extensive information about their sensor-based monitoring system XiltriX. Moreover, Contell/IKS made interviews with several employees of the UMC St. Radboud (the university hospital of Nijmegen, the Netherlands) possible. This customer also provided stored historical data, which enables a validation of promising analyzing methods.

Based on the interviews' results, this chapter will highlight the importance of sensor-based temperature monitoring of cooling devices within medical laboratories. Furthermore, typical behaviors of cooling devices as well as currently applied monitoring methods are introduced. The identification of possible problems and a requirements analysis for a perfectly working monitoring system conclude this chapter.

2.1 Importance of Temperature Monitoring within Medical Laboratories

As already pointed out in the last chapter, sensor-based temperature monitoring is becoming increasingly important within many different settings. Its task is to reliably determine the condition of monitored devices. In general, a monitored device should meet the following criteria to be classified as OK [Weerdesteyn06] (a rough sketch of such a check follows the list):

1. The current state is within predefined specifications
2. The general behavior did not change significantly in the short run
3. The general behavior did not change significantly in the long run
4. Presumably, the behavior will not change significantly in the future
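As a hypothetical illustration of how the first three criteria could be checked automatically against a recorded temperature series, consider the following Python sketch. The window sizes and the drift threshold are invented for illustration only; suitable statistical methods are the subject of chapter 5.

    # Hypothetical sketch: checking criteria 1-3 on a list of equally
    # spaced temperature samples (newest last). Window sizes and the
    # drift threshold are illustrative assumptions, not thesis values.
    from statistics import mean

    def classify(readings, t_min, t_max,
                 short_win=60, long_win=1440, max_drift=0.5):
        current = readings[-1]
        if not (t_min <= current <= t_max):
            return "outside specification"         # criterion 1 violated
        recent = mean(readings[-short_win:])
        previous = mean(readings[-2 * short_win:-short_win])
        if abs(recent - previous) > max_drift:
            return "short-run change"              # criterion 2 violated
        if abs(recent - mean(readings[:long_win])) > max_drift:
            return "long-run change"               # criterion 3 violated
        # Criterion 4 (future behavior) needs a prediction model,
        # e.g. the regression methods discussed in chapter 5.
        return "OK"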

Such a classification is very important, because a lot of medical goods have to be kept cool. Blood samples, for example, need a constant temperature of about 6°C; changes in temperature over a longer time are dangerous to them. Even more critical are cryogenic freezers, whose samples are stored at -80°C or even colder. A freezer's malfunction can destroy these samples within a very short time. That has to be avoided, because most of them are part of research work and irrecoverable. The contents of a fridge normally range in age from a few days to more than thirty years. That is why the breakdown of a freezer can lead to a loss of more than half a million Euro. As a result, a possible breakdown has to be recognized as soon as possible in order to be able to rescue the contents by moving them to other devices. [Nijmegen06]

It is very important to know that events like this cannot be insured, because of the high risk. Therefore, many medical laboratories and especially hospitals are very interested in an intelligent monitoring solution that is able to recognize upcoming failures. [Weerdesteyn06]

2.2 Functioning and Behavior of Freezers and Fridges

In order to develop new or improve existing sensor-based monitoring approaches, this section will introduce the necessary knowledge of the functioning and the behavior of cooling devices.

2.2.1 General Functioning of a Fridge

Although different kinds of cooling devices with different technology exist, they are all based on the same idea, the cooling cycle. Figure 2-1 illustrates the cooling cycle of a regular household refrigerator. The basic idea of this cycle is to transport heat energy from the inside to the outside of the fridge.

Figure 2-1: Cooling Cycle of a Household Refrigerator (adapted) [UniMunich06]

The exemplary cycle in Figure 2-1 uses a compressor, and a refrigerant circulates within the cycle. The refrigerant reaches the compressor (4) vaporized. The compressor compresses the gas into the condenser coil (1). Because of the generated high pressure, the vaporized refrigerant becomes liquid and emits heat. After cooling down, the refrigerant passes the expansion valve (2). The second half of the cycle is called the evaporator coil (3). Within this low-pressure part, the liquid refrigerant starts to vaporize again. The energy needed for this is taken from the air inside the fridge, so that the inside cools down. The vaporized refrigerant reaches the compressor again and the cycle starts anew. [UniMunich06]

Fridges with a cooling cycle like this are called active fridges. Within laboratories and industry, a second class of fridges exists. These devices do not have their own compressor; they are supplied with cold air by a centralized unit. Devices like that are called passive fridges.

2.2.2 Technical Behavior of Fridges (Without External Influences)

Due to the functioning just described, active as well as passive cooling devices do not keep a constant temperature. In fact, they warm up a bit, then cool down until they are cold enough to turn off again. Depending on the kind of cooling device, the temperature sequence looks different. This technical behavior will be exemplified with temperature sequences of different kinds of cooling devices, to give an impression of possible behavior.

These examples were taken from the Contell/IKS XiltriX demo system, which was built for testing and presentation purposes. This system monitors some demo fridges 24/7. As these fridges are normally empty and their doors are kept closed, the collected data offers an overview of typical behavior without external influences.

Figure 2-2 pictures a temperature sequence of about eighteen hours from a properly working 6°C passive fridge. Most of the time, the temperature oscillates between 4°C and 6°C. Moreover, nearly every cooling cycle takes about twenty minutes.

Figure 2-2: Temperature Sequence of a Properly Working 6°C Passive Fridge [DEMO06]

Figure 2-2 contains three eye-catching cycles. The first two are between 16 and 18 o'clock: in one cycle the fridge cools down although the upper limit of 6°C is not reached, and three cycles later the fridge heats up to more than 7°C. The last suspicious cycle is around 2 o'clock in the morning, when the fridge reaches a temperature of 6.8°C before it starts to cool down again.

As the following graphs from other machines will show, behavior like this has to be classified as normal. Every fridge behaves "suspiciously" sometimes without really malfunctioning. Indeed, machines of the same type behave differently, and even fridges identical in construction can show different behavior for unknown reasons. 1

Figure 2-3: Temperature Sequence of a Properly Working 6°C Active Fridge [DEMO06]

Figure 2-3 shows the temperature sequence of an active fridge. Just like the previous passive one, it should have a temperature of 6°C. Comparing the two, the active fridge never exceeded 6°C within the two days shown, whereas the passive fridge exceeded 6°C about every twenty minutes. Another difference can be found in the shape of the graph. Figure 2-2 shows a more regular shape with very short cooling cycles, which is typical for a passive fridge. Figure 2-3 does not contain such a regular pattern: the cooling cycles are similar but vary in shape. Moreover, the passive fridge's cooling cycle lasts less than half as long as that of the active device, which is about 43 minutes.
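The cycle duration itself can be estimated mechanically from recorded data. The following sketch, with hypothetical names and invented defaults, locates the local maxima at which cooling sets in and averages the peak-to-peak distances; applied to the two fridges above, it should yield values near twenty and 43 minutes, respectively.

    # Illustrative sketch: estimating the mean cooling-cycle duration
    # from temperature samples taken every `interval_min` minutes.
    def cycle_minutes(samples, interval_min=1.0):
        # Local maxima mark the points where cooling sets in; noisy
        # real-world data would need to be smoothed beforehand.
        peaks = [i for i in range(1, len(samples) - 1)
                 if samples[i - 1] < samples[i] >= samples[i + 1]]
        if len(peaks) < 2:
            return None  # not enough complete cycles in the data
        gaps = [b - a for a, b in zip(peaks, peaks[1:])]
        return interval_min * sum(gaps) / len(gaps)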

Most results of this comparison cannot be generalized, because counterexamples exist [DEMO06]. The only reliable indication of a passive fridge is the regular pattern with very short cooling cycles and a larger deviation. All other differences could be the other way round when comparing two other 6°C fridges. 2

1 The reasons are unknown because of the lack of information problem (see section 2.4.1 for details).
2 See [DEMO06] and [UMC06] for further details.

2.2.3 Technical Behavior of Freezers (Without External Influences)

Looking at freezers complicates the situation even further. Figure 2-4 pictures the temperature sequence of a -80°C active freezer. Although it operates slightly above the specified value, it works very accurately: the total deviation is less than 2°C within the displayed time of five days. On the other hand, the graph contains a trend: within four days the daily mean increased by more than half a degree. An event like this has to be recognized and examined when classifying the system's behavior.

Figure 2-4: Temperature Sequence of a -80°C Active Freezer [DEMO06]

Figure 2-5 shows the behavior of a -80°C passive freezer, which behaves totally differently. Its data does not contain a trend, but it oscillates much more than that of the freezer above. Furthermore, -80°C is never reached, and the total deviation is more than 8°C, so that the temperature regularly exceeds -70°C.

Figure 2-5: Temperature Sequence of a -80°C Passive Freezer [DEMO06]

Figure 2-6 shows yet another kind of freezer. The red lines signal door openings. The special thing about this device is that, for technical reasons, it needs a regeneration cycle every few hours. Compared to the previous datasets, the oscillation is much higher and the shape of the graph is more irregular. But as this is normal behavior for this kind of freezer, it should be classified as OK.

Figure 2-6: Temperature Sequence of a -20°C Active Freezer [DEMO06]

2.2.4 Behavior in Practice

As mentioned at the beginning of this section, the temperature sequences shown so far originate from the Contell/IKS demo system. Since these monitored cooling devices are empty and not in use, they are not externally influenced by users. In practice, a cooling device can be influenced by a large number of variables. 3 Hence, the temperature sequence of a correspondingly monitored cooling device changes to a more irregular pattern. The types and origins of external influences will be identified in section 2.4.2. Until then, the following example should give a general idea of temperature sequences in practice.

3 See section 2.4.2 for details.

Figure 2-7 shows the behavior of a properly working cryogenic -180°C freezer in practice. The data originates from the UMC St. Radboud (the university hospital of Nijmegen, the Netherlands) and represents typical behavior for this kind of device. 4

4 The UMC St. Radboud provided its datasets only as a copy from its XiltriX system, which is why Matlab was used to draw this graph. Apart from the slightly different appearance, the data would look the same in XiltriX.

In contrast to the previous examples, this figure pictures a larger time slice. These ten months were chosen to give an impression of practical behavior in the long run. Recognizable is a baseline at about -183°C. Due to the different scaling, the figure no longer pictures the single cooling cycles, although they do exist. Instead, over a hundred irregular peaks are visible that cannot be traced back to typical technical behavior. In fact, not a single peak was caused by a technical malfunction [Nijmegen06]. Hence, monitoring systems have to be able to figure out whether such a peak is caused by a technical malfunction or by external influences.

Figure 2-7: Temperature Sequence of a Cryogenic Freezer in Practical Use [UMC06]

2.2.5 Behavior in Case of a Malfunction

Unfortunately, the 36 provided datasets of the UMC St. Radboud do not contain a single technical malfunction [UMC06], [Nijmegen06]. In fact, most cooling devices operate for years without a single failure, which means the probability of a technical malfunction is very low. Hence, it is not possible to introduce a sample malfunction pattern here. Nevertheless, the following criteria have been identified as hints of a malfunction in the absence of external influences [Weerdesteyn06]:

1. The form and shape of the temperature sequence changes significantly in the short run
2. The form and shape of the temperature sequence changes significantly in the long run
3. The temperature exceeds the range of normal operation

A temperature exceedance without external influence is a definite indication of a cooling device's malfunction. But most technical failures are caused by compressor breakdowns. Usually, such a breakdown does not appear suddenly but predictably, because the form and shape of the corresponding temperature sequence start to diverge before the compressor actually breaks down. An early recognition could allow predictive maintenance. [Weerdesteyn06]

2.3 Current Practice of Sensor Based Temperature Monitoring

The basic idea of currently applied sensor-based temperature monitoring is to attach a sensor to a cooling device. The collected information is used to evaluate the condition of the monitored fridge. The assumption behind this idea is that a cooling device is malfunctioning, or at least has to be looked at, when a regular temperature range is exceeded.

Based on this assumption, the current main approach of temperature monitoring is to define critical minimum and/or maximum temperature limits, which may not be exceeded. This idea leads to three different kinds of temperature monitoring in current practice:

1. Temperature verification in retrospect
2. Online comparison of current temperature values to a specified range
3. Online comparison and data analysis in retrospect

In general, temperature verification in retrospect is based on a single indication sensor that operates as an isolated application. The task of this kind of sensor is simply to indicate whether a temperature exceedance occurred during the monitoring time. More advanced sensors are also able to indicate the duration of the exceedance or the most critical temperature value. This approach offers very little information and is not designed to avoid critical temperatures but to report them in retrospect. 5 Hence, this approach is often used within the setting of transportation of frozen goods, but it is not suitable for monitoring important samples that may not defrost under any circumstances.

5 See section 3.5.1 for a sample product.

The second kind of temperature monitoring is often found in practice. The basic idea is simply to compare the actual measurement values to the predefined temperature range at short time intervals. In case of a temperature exceedance, an alarm is raised immediately to notify a person in charge. In contrast to the first kind of temperature monitoring, this one can operate as an isolated application as well as a centralized one. An isolated application is characterized by using its own features to raise an alarm, such as built-in flashlights or sirens. A centralized application (e.g. XiltriX) transfers information about critical situations to a centralized unit that displays the current status of all monitored devices in one place.
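A minimal sketch of this second kind of monitoring might look as follows; the `notify` callback is a hypothetical stand-in for either a local siren (isolated application) or a message to a central server (centralized application):

    # Minimal sketch of an online limit comparison. `notify` is a
    # hypothetical callback: a built-in siren in an isolated setup,
    # or a message to the centralized unit in a system like XiltriX.
    def check_reading(value, t_min, t_max, notify):
        if value < t_min or value > t_max:
            notify("temperature %.1f outside [%s, %s]" % (value, t_min, t_max))
            return "ALARM"
        return "OK"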

The third kind of temperature monitoring is an extension of the one just presented. Besides comparing actual temperature values to predefined intervals, the temperature sequences of the single devices are stored. Again, this kind of temperature monitoring can be implemented as an isolated application as well as a centralized one. The stored historical temperature sequences enable data analysis in retrospect to detect changes in behavior over time.

Up to now, this data analysis has been kept very simple. Apart from basic visualization possibilities for evaluating the behavior manually and some provided statistical measures, current temperature monitoring products in the market do not contain more complex analyzing methods. 6

6 See chapter 3 for details.

Hence, the main task of this diploma thesis is to find additional analyzing methods to offer more precise status information on monitored cooling devices. To be able to do that, the next section will identify the problems and potential sources of error that current sensor-based temperature monitoring is faced with.

2.4 Problems and Potential Sources of Error

Data analysis (e.g. statistics) can lead to two different kinds of error ([Scharnbacher04], p. 85):

1. Error of the first kind
2. Error of the second kind

Based on the null hypothesis (H₀ = "the cooling device is OK"), four different cases are possible, as pictured in Table 2-1. The aimed-for goal within the setting of temperature monitoring of cooling devices is the ability to always reach the right decision. As stated in section 2.1, the monitoring task within this setting is mission critical. Hence, an error of the second kind has to be avoided in any case. In contrast, an error of the first kind is only a false alarm, which is indeed disturbing but not dangerous.

Table 2-1: Error of First and Second Kind

                   H₀ is correct              H₀ is wrong
Acceptance of H₀   Right decision             Error of the second kind
Rejection of H₀    Error of the first kind    Right decision
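To make the two kinds of error concrete: given readings for which the true device condition is known, the errors produced by a pure limit check could be tallied as in the following sketch (data, names and limits are invented for illustration):

    # Illustrative tally of both error kinds for a pure limit check,
    # given (reading, truly_defective) pairs with known ground truth.
    def error_counts(observations, t_min, t_max):
        first = second = 0
        for value, defective in observations:
            alarm = not (t_min <= value <= t_max)  # alarm = rejection of H0
            if alarm and not defective:
                first += 1    # false alarm: H0 rejected although correct
            elif not alarm and defective:
                second += 1   # missed malfunction: H0 accepted although wrong
        return first, second

Widening the limits lowers the first count but raises the second; section 2.4.3 returns to exactly this trade-off.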

The succeeding subsection will introduce the major problem that sensor-based temperature monitoring is faced with, and its consequences for errors of the first and second kind.

2.4.1 The Lack of Information Problem

Currently, the major problem within the setting of sensor-based temperature monitoring of cooling devices is a lack of information. All well-known systems in the market attach only a single temperature sensor to a fridge. That is why, in most cases, only the current temperature of a cooling device is available for analyzing purposes. Advanced systems like the XiltriX system introduced below offer, for instance, the possibility to add an additional door sensor, so that at least a second piece of information is available.

In fact, there are many factors that influence the temperature inside a fridge. Figure 2-8 specifies some of these factors and illustrates the problem of current systems. Of course, it would be possible to add additional sensors to every monitored device, but their quantity is always kept small to minimize expenses [Nijmegen06]. For example, every temperature sensor for XiltriX causes additional costs of about 500 € [Weerdesteyn06]. In practice, this leads to the use of a single temperature sensor, sometimes in combination with a door opening sensor.

This lack of information problem turns the cooling device into a black box and prevents the identification of the real causes of temperature deviations. In particular, the needed information of whether a fridge was significantly externally influenced within a certain time cannot be obtained with certainty. 7 This problem leads to potential sources of error when analyzing temperature sequences. As an error of the second kind has to be avoided in any case, the quantity of errors of the first kind increases in situations of unknown influences.

7 See section 2.4.2 for details.

Figure 2-8: Lack of Information Problem Caused by Sensor Based Temperature Monitoring

A second problem increases the lack of information even further. It is caused by the unknown behavior between two single measuring points. Figure 2-9 exemplifies a rise and fall of temperature between two of these points; analyzing this data in retrospect would disregard this actual behavior. Furthermore, a graphical as well as a numerical analysis would assume a constant temperature within this interval, as indicated by the red dashed line.

Figure 2-9: The Problem of Unknown Behavior between Two Single Data Points
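The following sketch reproduces the situation of Figure 2-9 with invented numbers: a two-minute temperature spike falls entirely between two measuring points and is therefore invisible to any analysis of the stored data.

    # Invented example: the monitor samples every 5 minutes, but the
    # (normally unknown) true temperature spikes between minutes 7 and 9.
    def true_temperature(t):
        return 8.0 if 7.0 < t < 9.0 else 5.0

    samples = [true_temperature(t) for t in range(0, 21, 5)]
    print(samples)             # [5.0, 5.0, 5.0, 5.0, 5.0] -- the spike is never seen
    print(max(samples) <= 6.0) # True: a 6 degC limit check raises no alarm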

2.4.2 Potential Sources of Error

A change in cooling behavior could indeed be caused by a technical malfunction. But as the probability of such a malfunction is very small 8, a change is normally caused by other, external influences. Due to the lack of information problem, the reason for abnormal behavior cannot always be determined. This subsection identifies common influences that can lead to false alarms. They can be divided into two groups:

1. Environmental influences
2. User interaction

8 See section 2.2.5 for details.

Environmental influences are rather rare. Basically, every imaginable environmental change could influence the behavior of a cooling device. But in practice, only two common factors have been identified that really change the temperature sequence although the technical condition remains the same [Weerdesteyn06]:

1. A significant change in ambient room temperature
2. A power failure

A change in ambient room temperature generally changes the warming-up and cooling-down behavior of freezers and fridges, so that the changed temperature sequence of the corresponding cooling device could lead to a rejection of H₀. This decision has to be classified as an error of the first kind. In contrast, an alarm raised because of a power failure should be classified as a right decision, because such a situation endangers the stored samples even though the technical condition of the cooling device is still OK.

But as these environmental influences are very infrequent, the main focus has to be kept on changes in behavior due to user interaction. In general, this behavior is not measured. Only some monitoring products attach an additional door sensor to monitored devices to recognize at least door openings. In fact, door openings influence the cooling behavior significantly, because warm air enters the fridge. Freezers in particular heat up very fast, so that an open door leads to an alarm within a very short time. [Nijmegen06]

Aside from door openings, the condition of newly inserted samples as well as the filling level of a cooling device are significant influencing factors. The insertion of warm samples leads to an enduring heating-up, even after the door has been closed again. Moreover, the fridge's filling level can vary the cooling-down time, so that the form and shape of the corresponding temperature sequence change although the technical condition remains the same.

Beside these generally existing sources of error, current practice is faced with additional problems that originate from the currently applied method introduced in section 2.3.

2.4.3 Methodological Problems

The graphs presented in section 2.2 already exemplified many different behaviors of fridges and freezers. These examples were chosen to show the difficulty of accurately classifying different kinds of behavior as normal operation or malfunction.

The currently applied method of predefining critical temperature limits only allows a classification that is based on the actual temperature value. 9 Hence, as soon as the temperature rises above the predefined maximum or falls below the predefined minimum, the cooling device is classified as malfunctioning. This method can indicate a bad technical condition of a fridge. But due to the lack of information and other possible error sources, it is impossible to prove a malfunction by using this method.

9 See section 2.3 for details.

Since an error of the second kind has to be avoided in any case, H₀ has to be rejected every time the temperature limits are exceeded. This leads to a very high number of errors of the first kind, because of the very low probability of a real technical malfunction. 10

10 See section 2.2.5 for details.

Beside this high number of false alarms, another methodological problem exists. As mentioned in section 2.2.5, most malfunctions develop gradually, so that they could be recognized before the temperature limits are exceeded. Such a change in the form and shape of a temperature sequence is not recognized by the current method. Hence, situations like this lead to an error of the second kind, because H₀ is accepted although the system is starting to malfunction.

The required recognition of changes in behavior in the long run is also only possible to some extent with the existing method. Typically, a change is bound to significantly higher or lower temperatures. In that case, the temperature regularly exceeds one of the predefined limits and H₀ is rejected.

Problematic are slight changes within the defined temperature range. A small increase of the mean temperature, for instance, typically also increases the peak values and leads to a temperature exceedance. But a small increase of the mean temperature with unchanged peaks will not be recognized. This again causes an error of the second kind, because H₀ is accepted although the monitored device may already be malfunctioning.

Beside all these problems, defining appropriate critical temperature limits is the greatest methodological problem. On the one hand, limits set only slightly outside the typical temperature range would decrease the number of errors of the second kind, because already slight changes in temperature lead to a rejection of H₀. On the other hand, nearly every external influence then also leads to a rejection of H₀, which has to be classified as an error of the first kind in nearly all cases.

In practice, critical temperature limits are normally defined with a wider span to reduce the number of false alarms caused by external influences. As mentioned before, this practice increases the probability of an error of the second kind. 11

11 Figure 2-7 pictures this problem quite well. The red dashed line marks the predefined maximum critical temperature. As long as this temperature is not exceeded, the null hypothesis H₀ is accepted, even if the cooling device is already malfunctioning.

To be able to improve this unacceptable situation, the next section will work out a requirements analysis as a basis for finding new methods.

2.5 Aimed Goal and Requirements Analysis

As mentioned in section 1.2, the aimed-for goal is to improve the current situation by offering decision support. This decision support can be offered by providing a person in charge with more information than just the current temperature and the status information of an optionally installed door opening sensor. This additional information should enable the responsible person to classify the current behavior of a cooling device more precisely.

As the attachment of additional sensors shall not be considered by this diploma thesis 12, the only way to gain additional information about a cooling device is the analysis of stored historical temperature sequences. Many more highly developed systems already store this kind of data but only offer basic visualization possibilities and sometimes basic statistical summaries. Moreover, such systems currently only allow data analysis by hand.

12 See section 1.2 for details.

This leads to a large amount of stored historical data with currently very little use. The main idea is now to test statistical and data mining methods for their applicability to improve the current situation of scarce information. In particular, reliable answers regarding the criteria given at the beginning of chapter 2 would offer great decision support, as pictured in Figure 2-10.

Figure 2-10: Estimated Answers of Statistics and Data Mining

One hundred percent reliable answers to the questions on the right would allow a perfect classification of cooling devices as OK or malfunctioning. But even if the answers could only be given with a lower reliability, the possible knowledge gain could at least support the assessment of the current technical condition and put it on a broader basis than just the current temperature.

Beside these four criteria, section 2.1 identified two more important requirements. Since the stored samples are normally of high value and easy to destroy, a monitoring approach has to be able to identify failures as soon as they are recognizable, because an early detection leaves additional time to move the stored samples to other fridges. Moreover, it must be possible to avoid an error of the second kind in any case.


According to section 2.2.5, another requirement is the ability to recognize external influences, because only changes that cannot be traced back to these influences have to be classified as a malfunction. The following list summarizes the requirements analysis:

• The monitoring approach is able to classify the current state of a monitored device
• The monitoring approach is able to recognize significant changes of general behavior in the short run
• The monitoring approach is able to recognize significant changes of general behavior in the long run
• The monitoring approach is able to predict upcoming failures
• The monitoring approach is able to identify failures as soon as they are recognizable
• The monitoring approach is able to avoid an error of the second kind in any case
• The monitoring approach is able to recognize external influences

Based on these requirements, chapter 5 will introduce promising statistical and data mining methods, which will be tested for feasibility in the following chapters. But before that, the next chapter will introduce XiltriX and other major sensor-based monitoring products and review them according to the requirements just worked out.


3 Current Monitoring Systems

The last chapter pointed out the existing problems of the temperature monitoring task and the limitations of the current approach of just setting critical temperature limits. This chapter will introduce currently available monitoring systems and identify their existing problems. The main focus is kept on XiltriX, but section 3.5 will review other products as well and point out differences.

XiltriX is a monitoring system developed by the Dutch company Contell/IKS. It consists of a combination of hardware and software, which realizes the basic tasks of monitoring in the setting of medical laboratories. The basic idea is to attach sensors to cooling devices and to collect the measurement data on a centralized web server. If a predefined temperature limit is exceeded, the system is able to notify a person in charge locally and remotely, for instance by flashlights or SMS.

The basic development of this system started in 1991. In that year the company IKS 13 published its first monitoring system. It was named JS and was built in cooperation with several Dutch blood banks and aqua labs. Over the years the system was improved by implementing user suggestions. After releasing JS 8, JS 16, JS 32, JS 64 and JS 2000, IKS decided to rebuild the system completely, using modern hardware and software possibilities and the knowledge gathered from the JS development. This rebuild was published in 2003 as JS 2003. Apart from the change of name to XiltriX and some minor improvements, this version is still the current state. [Weerdesteyn06]

3.1 XiltriX’s Technical Basis

Figure 3-1 pictures a flowchart that introduces the general approach of most sensor based temperature monitoring systems, including XiltriX. The first step of monitoring is the collection of available data. Afterwards, this data is stored in a database for documentation purposes. As described in section 2.3, the current state of a monitored cooling device is only identified by comparing the current temperature to the predefined critical temperature limits. As long as the measured temperature is within the predefined limits, the monitored device is classified as OK. Otherwise, the monitored device is classified as malfunctioning and a person in charge is notified. As monitoring is a continuous task in general, this procedure is repeated every time the predefined time interval has elapsed. This is indicated by the black dashed line.

13 The companies Contell and IKS merged in January 2006


Figure 3-1: Flowchart of the Temperature Monitoring Task
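To make this control flow concrete, the following minimal Python sketch mirrors the loop of Figure 3-1; the callables read_sensor, store and notify as well as the chosen limits and interval are hypothetical placeholders, not XiltriX’s actual interface.

    import time

    CHECK_INTERVAL = 15 * 60        # assumed monitoring interval in seconds
    LO_LIMIT, HI_LIMIT = 2.0, 8.0   # example critical temperature limits in °C

    def monitor(read_sensor, store, notify):
        """Repeat the collect/store/classify cycle of Figure 3-1 forever."""
        while True:
            temp = read_sensor()                    # collect the available data
            store(temp)                             # store it for documentation
            if not (LO_LIMIT <= temp <= HI_LIMIT):  # outside the limits:
                notify(temp)                        # classified as malfunctioning
            time.sleep(CHECK_INTERVAL)              # repeat after the interval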

Section 2.4.1 introduced the lack of information problem that causes many false alarms. Figure 3-1 illustrates that even parts of the known data remain unused for classification purposes. This is indicated by the green and red arrows. Although all available data is collected and stored, only the current temperature and the predefined critical temperature limits are used by XiltriX to determine the cooling device’s condition. In particular, the stored historical temperature data is not used. Only the user has the possibility to analyze the collected data manually.

The next sections will introduce the possibilities XiltriX currently offers. This description is divided into three parts:

1. XiltriX’s components (this section)
2. XiltriX’s basic functionality (section 3.2)
3. XiltriX’s additional features (section 3.3)

3.1.1 Basic Components of a XiltriX Installation

A basic XiltriX installation consists at least of a web server, one or more power supplies, one or more substations (called OS-4s) and several temperature sensors (called PT100s). Figure 3-2 pictures a schematic drawing of the connections between the single units of XiltriX.

Although all parts are mandatory for a working XiltriX system, the web server is the most important one, because it contains the XiltriX software and stores the measurement data. The software is provided as a Java applet, which means that no local installations are necessary. Every client just needs a web browser such as Microsoft Internet Explorer 6 and a connection to the local area network. In case of a web server breakdown, the whole XiltriX system will stop working.

Also important are the OS-4s. They are installed near the device that is to be monitored. Each of these substations offers the possibility to attach up to four sensors and up to four digital devices like switches, sirens or flashlights. 14 Furthermore, it is possible to connect up to ten substations in a row.

The connection between the web server and the substations is made by the use of the system’s power supplies. Each power supply is capable of energizing five rows of substations with a maximum number of ten devices per row. Furthermore, the same cable is used to relay the measurement data from the connected OS-4s to the web server, so that no additional cable is needed.

14 See section 3.3.1 for further details


Figure 3-2: XiltriX - Schematic Drawing of an Installation with Basic Components

3.1.2 Other Installation Possibilities

Typically, the devices to be monitored by XiltriX are spread all over the building. That is why a hardware installation of XiltriX involves a lot of wiring. Therefore, XiltriX offers two additional possibilities of connecting substations to the web server:

1. Usage of an existing local area network
2. Usage of a wireless connection

Figure 3-2 demonstrates that normally only the web server is connected to the existing company network to publish the collected information. But this network can also be used to transport the measurement data directly from the substation to the web server. To do so, it is necessary to convert the substation’s signal to a TCP/IP signal, which is compatible with a local area network. This can be done with a converter that is also available for XiltriX. Of course, a substation connected like this needs its own power supply because the network does not provide energy.

The second additional possibility is the usage of a wireless LAN. Similar to the approach just introduced, the substation is equipped with its own power supply. The only difference is the way the data is sent to the web server. Instead of using a converter for the local area network, an additional wireless LAN is installed. This method saves a lot of wiring, but it is less reliable than a cable connection due to radio interference within hospitals. [Weerdesteyn06]

3.2 XiltriX’s Basic Functionality

The last section focused on the general idea and the technical basis of XiltriX. This section will now introduce XiltriX’s basic functionality, i.e. features that are mandatory for a sensor based monitoring system and nothing unique. Section 3.3 will introduce special features that were implemented to overcome limitations of the current monitoring approach.

Figure 3-1 pictures the flow of information. A person in charge is notified in case the predefined temperature limits are exceeded. Besides that, information can be obtained from two additional sources, as indicated by the dashed arrows:

1. A display that shows current data
2. The database that contains the historical temperature data

XiltriX offers both possibilities. Figure 3-3 pictures the main screen of XiltriX. It gives an aggregated overview of the current data of all monitored devices and can be accessed on every computer within the network. Most important is the white table in the middle of the screen because it contains machine based data. Depending on the system’s configuration, this table shows current data from machines of one or more departments. The first column represents the status of an optionally connected door sensor. Empty rows indicate that this sensor is missing.

Furthermore, a unique identification number and a description are assigned to every monitored device; these are displayed in the second and the fourth column. The third column indicates the activation of the high resolution mode by showing an asterisk. This mode forces the system to store a measuring point every single minute instead of every 15 minutes. 15

Column number five shows the last measured value. Depending on the classified current state of the attached device, this value can be up to 15 minutes old in normal operation mode. To be able to classify such temperature values, critical limits are set, as described in section 2.3. These limits can be seen in columns seven and eight for every single device.

Figure 3-3: XiltriX - The Main Screen [DEMO06]

In addition to these limits, a delay time can be defined for minimum and maximum temperature alarms in columns six and nine. After a critical temperature limit has been passed, the system waits for a predefined time before it alarms the person in charge. The last two columns contain date, time and the most critical temperature value of a current alarm. Entries within these two columns can only be cleared by an alarm reset. 16

15 See section 3.2.1 for details
16 See section 3.2.2 for details
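One possible reading of this delay behavior is sketched below in Python; the sample-stream representation and the function name are assumptions for illustration, not XiltriX internals.

    def delayed_alarm(samples, lo, hi, delay):
        """Return the first (time, temperature) at which a limit violation
        has persisted longer than the configured delay time, else None.
        samples is an iterable of (timestamp in seconds, temperature)."""
        violation_start = None
        for t, temp in samples:
            if lo <= temp <= hi:
                violation_start = None       # back within limits: reset
            else:
                if violation_start is None:
                    violation_start = t      # first out-of-limit sample
                if t - violation_start >= delay:
                    return t, temp           # delay has passed: alarm
        return None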


To indicate important events within this table, XiltriX uses a color code to highlight exceptional temperature values and alarm messages. The colors and their meanings are listed in Table 3-1.

Table 3-1: Listing of Existing Table Color Codes and Their Meaning

Orange: Temperature exceeded the set minimum or maximum limit value or the door is open (within delay time)
Blue: Temperature exceeded the set minimum limit value (delay time has passed)
Red: Temperature exceeded the set maximum limit value (delay time has passed)
Yellow: An alarm has been canceled but not reset yet
Purple: An alarm has been reset but an activation delay is configured and active 17

Below the table just described there is another smaller one, which offers information from digital input devices that can be, but do not have to be, bound to a single machine. Figure 3-3 shows an installed start/stop switch for fridge number 4. 18 This switch can be used to stop the monitoring of this device. If the corresponding button is pushed, the monitoring of this device will be stopped. But at the same time an alarm would go off, because the delay time for this device is set to zero. This can be seen in the DS column.

Of course, a scenario like that does not make sense in practice, but this data is taken from the demo system. It is only made up for testing purposes and should only show the technical possibilities of XiltriX. In reality, a button like this could be useful, for instance, to disable alarms for cleaning purposes; of course, a delay time higher than zero minutes would then be necessary. Other attachable switches and their functionality will be presented in section 3.3.1.

Another important element of the main screen is the status bar above the tables just described, because it offers an overview of the whole system. Monitored devices can be grouped into up to 16 different departments. The color of the corresponding department button indicates whether every machine within this group is operating as expected or not. Table 3-2 gives an overview of the existing colors and their meanings.

17 See section 3.2.2 for details
18 See section 3.3.1 for details


Table 3-2: Listing of Existing Status Bar Color Codes and Their Meaning

Grey: Button is not in use or configured
Green: No alarms are activated
Red: An alarm has been activated within this section
Yellow: An alarm has been canceled but not reset yet
Blue: The SMS and/or E-Mail module is connected to XiltriX but turned off (only available for the SMS and E-Mail buttons)

The last six buttons offer additional information. “Tech” indicates a technical problem within XiltriX, for instance a broken cable to one of the sensors or one of the substations 19. “M1” and “M2” symbolize master alarms 1 and 2. Depending on the system’s configuration, it is possible to assign special kinds of serious failures to these buttons. 20 “Sys” reports system alarms. Consequently, it is similar to the technical alarm, but it reports less serious problems. In the standard configuration it is not in use because no parts of lesser importance are attached to the system.

The two remaining buttons are called “SMS” and “EMAIL”. They indicate the status of the optionally available remote alert modules. A red colored SMS button, for example, indicates that an SMS was sent due to a currently active alarm.

All described elements together allow a very quick overview of the current system state. Figure 3-3 on page 24, for instance, pictures a system in normal operation. All monitored devices are grouped into a single department, which does not report an alarm. “Tech” and “M1” also indicate a system running well within specifications. In addition, the system is capable of sending SMS and E-Mail. But the blue color indicates that in case of a malfunction these features will not be used, because they are turned off in the configuration of XiltriX.

3.2.1 Current Possibilities to Display and Analyze Stored Data

As already pointed out in sections 2.3 and 3.2, the recorded data would allow extended data analysis. But currently available systems only offer basic manual analysis. The current version of XiltriX only offers the following possibilities:

1. Display stored data in table form
2. Display stored data in graphical form
3. Display basic statistical information

19 Substations are explained in section 3.1
20 See section 3.3.3 for details

Figure 3-4 pictures the first possibility, displaying stored data in numerical form. This table contains all available information for a selected device. The first two columns contain the date and time of storage. This information is saved in local time (Central European Summer Time) and in GMT (Greenwich Mean Time). Normally, one of these two columns would be enough, but as XiltriX is certified according to ISO 9001:2000, both columns are necessary.

Columns three and four contain the measured temperature. “Raw value” is the raw digital value received from an attached sensor. This value is converted to the Celsius scale and stored as “evaluated value”. The conversion factor is about 100:1; to identify the exact factor, a calibration is done regularly for every single sensor. Moreover, “lo” and “hi” contain the critical temperature limits set at storage time. In combination with the “evaluated value” it is possible to analyze in retrospect the number of alarms during a specified time period.

Figure 3-4: XiltriX - Stored Data in Table Form [DEMO06]


The other columns may remain empty because they offer information from additionally attachable sensors and switches. If, for instance, a door sensor is installed, the corresponding column will contain a Boolean value: “0” indicates a closed and “1” an open door. As this section shall only give an overview of the stored data, the other possible switches will be explained in section 3.3.1.

Looking at data in graphical form offers a much better overview of past behavior than numerical data. A comparison of Figure 3-4 and Figure 3-5 demonstrates this difference. Both of them contain the same dataset within the same time range, which can be chosen freely. The biggest problem of the table form is the limited number of values that can be displayed on one screen without using the scroll bar. The graph can be scaled to fit the screen, so that the whole behavior over the chosen time range can be seen immediately. This allows an evaluation of the behavior of a cooling device within a very short time. It is easy to see that the exemplified fridge has a regular pattern with only very few outliers. This information would be hard to obtain without this visual help.

Figure 3-5: XiltriX - Stored Data in Graphical Form [DEMO06]


This way of visualizing data is the most common way past data is looked at [Weerdesteyn06]. Due to missing additional decision support, this is currently the only way “data analysis” can be done. In fact, the person in charge has to analyze the behavior of the different monitored devices by looking at their graphs. In case of uncommon behavior, it is necessary to look at the specific graph more frequently.

The third way data can be displayed with XiltriX is basic statistics. It offers additional information to determine the current condition of a monitored cooling device. Although statistical analysis is a powerful method to detect changes in behavior, the current approach is too simple, as described in the following. 21

Figure 3-6: XiltriX - Available Statistical Information [DEMO06]

Figure 3-6 presents the currently available statistical data. Again, the calculations refer to the same dataset as above. The single columns contain the channel number of the connected sensor as well as the minimum, the maximum and the average temperature value. Furthermore, the standard deviation, the number of occurred alarms and the mean kinetic temperature (MKT) are given. The calculation of these values is based on the stored measuring points.

21 See section 5.10 for new approaches

But these stored measuring points may cover irregular time ranges due to XiltriX’s storage behavior. In the standard configuration, the system updates every measured temperature value once a minute, and every 15 minutes a measuring point is stored on the web server. If the temperature of one of the monitored cooling devices exceeds its limits, the saving behavior of XiltriX changes for this particular device: as long as the alarm is not reset 22, a measuring point is stored every single minute.

The installation of a door sensor results in additionally stored measuring points. Beside the regularly stored points, a measurement is added every time the status of a door sensor changes. If, for instance, the door of a fridge is opened five times within one minute, ten measuring points will be saved for this particular device. In contrast, it is possible that not a single data point is stored within 14 minutes if no door is opened and the temperature is uncritical.

Due to this irregular storage behavior of XiltriX, the computed statistical values are not necessarily correct, because temperature values are not weighted over time during calculation. Only the offered mean kinetic temperature considers the different time frames and provides fully correct results.

3.2.2 Documentation of Occurred Alarms

In addition to temperature data, events are also stored in the database. Divided into several log files, logins as well as configuration changes are documented. The most important log file contains information about occurred alarms and their reasons.

As already indicated in section 3.2, every time an alarm goes off, it has to be acknowledged by a person in charge. This acknowledgement is done by an alarm reset. Figure 3-7 pictures the alarm documentation functionality. It offers the possibility to document the reason for an alarm as well as the actions performed to solve the occurred problem. Beside some available presets, it is also possible to enter a reason as free text. Moreover, it is possible to define an activation delay in minutes; within that time no new alarm will go off.

Figure 3-7: XiltriX - Alarm Documentation Window [DEMO06]

22 See section 3.2.2 for details

This stored information can be useful to evaluate the condition of a monitored cooling device. Many alarms due to open doors, for instance, indicate user misbehavior. By contrast, many alarms due to repair work or maintenance indicate unreliability of the monitored device. Because of the problem of unknown influences mentioned in section 2.4, this kind of documentation is very important to get at least some information about a freezer’s condition. A significantly high number of repair and maintenance activities leads to the assumption that the monitored device has to be replaced.

Unfortunately, this documentation possibility is rarely used in practice. Because of the high quantity of false alarms, the employees get tired of documentation and just leave the input fields blank when resetting an alarm.

3.3 XiltriX’s Additional Features

Up to now, the introduction of XiltriX focused on basic functionality that should be mandatory for every kind of sensor based temperature monitoring system. In the following, some additional features will be presented that were implemented to solve some of the existing problems. 23

3.3.1 Different Types of Attachable Digital Switches

One of these features is the opportunity to attach digital switches to XiltriX. These switches can be coupled to a certain monitored device but do not have to be. In general, there are three configuration possibilities for every digital switch:

1. A door switch
2. A start/stop switch
3. A high/low switch

First of all, it can be configured as a door switch. Coupled to a certain monitored device, it signals every single door opening and closing to XiltriX. Beside the regularly stored measuring points, an additional value is written to the database whenever the switch changes state. If a door switch is not coupled to a certain device, it can be used, for instance, to monitor the room door. If someone opens that door, this information will be displayed at the bottom of the main screen. Depending on the configuration, an alarm could also go off.

23 See section 2.4 for details

Aside from that, it is possible to configure a start/stop switch. This could be useful, for example, if regular maintenance work has to be done on certain monitored devices. If, for instance, a freezer is emptied and turned off, such a switch could stop monitoring or just suppress alarms.

The third usage possibility is a high/low switch. It enables the person in charge to switch between two limit configurations. If, for example, a room door is opened, the limits could be set to a higher span as long as the door is open. XiltriX offers additional ways to adapt limits to different kinds of situations. This will be explained in the following section 3.3.2.

3.3.2 Time-Dependent Limit Settings

The high/low switch just introduced already offers the possibility to select one of two limit configurations by just pressing a button. But the determination of critical limits is still one of the greatest problems. Section 2.4.3 introduced the problem of setting critical temperature limits; especially frequent door openings and other unknown user behavior cause many false alarms. As already pointed out, there is currently no way in practice to solve this problem.

That is why XiltriX offers another workaround beside the high/low switch. This workaround is based on the assumption that a monitored device is not faced with the same influences 24/7. Typically, there is a high quantity of external user influences like door openings within working time and only very few influences at night. For a scenario like that, XiltriX offers the possibility to set time dependent limits. To this end, one of five evaluation functions can be chosen and configured. They are called:

1. Permanent measuring value
2. Day cycle
3. Multiple day cycle
4. Permanent measuring value with on/off recognition
5. Multiple day cycle with on/off recognition

The first evaluation function sets static limits and static delay times as explained in section 2.3. The second and third functions offer the possibility to define time dependent limit and delay settings, so that it is possible to define limits with a higher span at daytime and limits with a lower span at nighttime. In addition, the multiple day cycle function enables the person in charge to set different settings for different days of the week.

Figure 3-8 exemplifies this possibility. The pictured configuration presents a limit setting of 2°C and 8°C within working time from Monday to Friday. During the residual time, limits of 2°C and 6°C are set. The activation delay defines the time that may elapse after switching before the new limits have to be kept.

The last two offered evaluation functions combine the idea of a high/low switch and the idea of setting time dependent limits. Especially the last function offers the possibility to extend a configuration like the one pictured below.

Figure 3-8: XiltriX - Time Dependent Limit Settings [DEMO06]

3.3.3 Alarm-, SMS- and E-Mail-Programs

Another powerful additional feature of XiltriX is the vast quantity of notification options in case of a critical situation. Beside a message on the main screen, XiltriX offers the possibility to send several kinds of local and remote messages like SMS or E-Mail. Furthermore, additional local hardware like sirens or flashlights can be controlled. To enhance the value of these possibilities, XiltriX offers alarm, SMS and E-Mail programs that provide a comfortable way of configuration for every single monitored device.

Alarm programs enable the person in charge to define the alarming behavior of additionally attached hardware. Up to eight different programs per department can be configured. Similar to the time dependent limit setting from section 3.3.2, an alarm program schedules different types of alarm relays. These relays have to be configured in advance. Figure 3-9 exemplifies a configuration for a locally installed flashlight.

Figure 3-9: XiltriX - Setting up an Alarm Relay [DEMO06]

Each configuration contains settings about the kind of alarm and one of the following three functions:

1. On continuously
2. On/off once
3. On/off cyclically

The first function activates the relay for as long as an alarm is active. An installed flashlight, for instance, would be turned on as long as the alarm is active. The second function also activates the corresponding relay as soon as a critical situation occurs, but it deactivates the relay again after the expiration of a defined report duration time. This time can be set between 1 and 99 seconds. Function number three is an extension of the second one. It also deactivates the relay after the set report duration time, but activates it again after a set delay time for as long as the alarm is not reset. This delay time can range from 1 to 99 minutes.
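These three relay functions can be captured in a few lines of Python, as sketched below; the mode names and the default times (ten seconds on, one minute off, matching the flashlight example of Figure 3-9) are assumptions.

    def relay_on(mode: str, elapsed: float, report: float = 10.0,
                 delay: float = 60.0) -> bool:
        """Given the seconds since alarm activation, decide whether the
        relay is currently energized (valid until the alarm is reset)."""
        if mode == "continuous":      # on as long as the alarm is active
            return True
        if mode == "once":            # on once, off after the report time
            return elapsed < report
        if mode == "cyclic":          # repeat report-on / delay-off cycles
            return elapsed % (report + delay) < report
        raise ValueError(f"unknown relay mode: {mode}")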


Aside from these functions, the kind of notified alarm can be chosen. As already explained in section 3.2, XiltriX is able to classify an alarm as a technical or a master alarm. A technical alarm indicates a malfunction in communication. A master alarm normally goes off when there is no reaction to a prior alarm.

Figure 3-9 shows the configuration of an installed flashlight. As soon as an alarm caused by the monitored device goes off, the flashlight turns on for ten seconds, turns off for one minute and turns on again. In case of a technical alarm or a master alarm 1 or 2, the flashlight will also signal the current situation as soon as the defined delay times of between 5 and 15 minutes are exceeded.

Beside the alarm programs for local notification hardware just introduced, XiltriX also offers functionality to notify persons in charge by means of SMS or E-Mail. For this purpose, additionally available modules have to be attached to the system.

The notification via E-Mail and SMS is quite simple. It is possible to configure a list of up to three responsible employees each. As soon as an alarm goes off, the first person is notified. After a definable delay time, the second one is notified if the alarm is still active. As long as no reset of the alarm is done, XiltriX will continue to notify these three people one after another. To allow an easy setup, XiltriX offers the possibility to create 16 configurations each. These configurations can be scheduled like the time dependent limits from section 3.3.2, so that the system always notifies the currently responsible employees.

3.4 Review of XiltriX According to the Requirements Analysis

The previous sections introduced basic and additional features as well as some of the technical basis of XiltriX. Especially the additional features were intended to solve the existing methodological problems.

Time dependent limits, for example, are able to reduce the number of alarms at daytime by choosing limits with a higher span. At night, the span can be reduced to achieve a lower probability of an error of the second kind. 24 Furthermore, the system is not only able to alarm locally, but also by the use of E-Mail and SMS. Features like these are meant to mitigate the lack of information problem. 25

24 See section 2.4 for details
25 See section 2.4.1 for details


But all these approaches are based either on the idea of adapting the limit settings or on the assumption that an immediate notification leads to a lower risk. In fact, XiltriX loses credibility due to a high quantity of false alarms. This leads to a higher risk, because the probability of an alarm not being taken seriously becomes very high.

Table 3-3 contains the requirements that have to be met by a sensor based temperature monitoring system 26 and XiltriX’s compliance with them. XiltriX’s current ability is limited to classifying the current state by just evaluating actual temperature values. Furthermore, significant changes in the short run may be detected, as they are normally bound to regular temperature limit violations. Hence, the first two requirements are partly fulfilled. All other requirements, including the avoidance of errors of the second kind, cannot be satisfied.

Table 3-3: Compliance of XiltriX According to the Requirements Analysis

XiltriX is able to classify the current state of a monitored device (partly fulfilled)
XiltriX is able to recognize significant changes of behavior in the short run (partly fulfilled)
XiltriX is able to recognize significant changes of behavior in the long run (not fulfilled)
XiltriX is able to predict upcoming failures (not fulfilled)
XiltriX is able to identify failures as soon as they are recognizable (not fulfilled)
XiltriX is able to avoid an error of the second kind in any case (not fulfilled)
XiltriX is able to recognize external influences (not fulfilled)

This review demonstrates the need to find better ways of data analysis within XiltriX to improve the current situation. The succeeding chapter 4 will point out the current state of research. Moreover, chapter 5 will present promising analysis methods. But before that, section 3.5 briefly introduces other major temperature monitoring products to point out other approaches available in the market.

3.5 Other Major Monitoring Products in the Market

As mentioned in section 2.3, three different types of sensor based temperature monitoring are current practice:

1. Temperature verification in retrospect
2. Online comparison of current temperature values to a specified range
3. Online comparison and data analysis in retrospect

26 See section 2.5 for details

The following subsections will introduce major products representing these types and will review their compliance with the requirements. Moreover, differences to XiltriX will be pointed out.

3.5.1 3M FreezeWatch and 3M MonitorMark Indicators

The company 3M offers two very simple temperature monitoring solutions. They are called “3M FreezeWatch Indicator” and “3M MonitorMark Time Temperature Indicator” (pictured in Figure 3-10 and Figure 3-11). Both products are designed for short term temperature verification in retrospect, especially during shipment.

Figure 3-10: 3M FreezeWatch Indicator [3M2006]
Figure 3-11: 3M MonitorMark Time Temperature Indicator [3M2006]

The FreezeWatch Indicator consists of an ampoule with an indication liquid inside. This ampoule is placed near the transported material during shipment. As soon as a critical temperature is reached, the indicator paper on the backside changes color permanently. This enables the receiver to verify whether a critical temperature was exceeded during shipment or not. The FreezeWatch is available as a 0°C and as a -4°C version. [3M2006]

The MonitorMark Time Temperature Indicator is based on the same idea. Beside the indication of a temperature limit violation, it offers additional basic information about duration or maximum temperature. As soon as the critical temperature is reached, the indicator paper starts to turn blue from left to right; the higher the temperature, the faster the movement. Response cards are used to analyze the time temperature relation for each indicator paper. These cards are unable to determine the highest temperature that actually occurred or the accurate duration, but they offer a worst case scenario of the highest possible values. That is because a long blue bar could be caused either by a lower critical temperature over a longer time period or by a high temperature over a short duration. [3M2006]

The functionality of the products just introduced is very limited but easy to install during shipment. Due to missing alarming possibilities and a missing online comparison of current temperature values, not a single requirement can be fulfilled by using these products. Hence, this approach is not suitable for lab equipment monitoring, as the following Table 3-4 shows.

Table 3-4: Compliance of 3M Indicators According to the Requirements Analysis

Product is able to classify the current state of a monitored device (not fulfilled)
Product is able to recognize significant changes of behavior in the short run (not fulfilled)
Product is able to recognize significant changes of behavior in the long run (not fulfilled)
Product is able to predict upcoming failures (not fulfilled)
Product is able to identify failures as soon as they are recognizable (not fulfilled)
Product is able to avoid an error of the second kind in any case (not fulfilled)
Product is able to recognize external influences (not fulfilled)

3.5.2 2DI ThermaViewer

The company 2DI offers the “ThermaViewer”. This instrument is intended to monitor single devices without the need for a PC. It is equipped with two sensors to measure temperature and humidity level. A computer is not necessary because the ThermaViewer is capable of storing 44000 measuring points itself. The stored data can be displayed on its own display, as pictured in Figure 3-12. [2DI2006]

Figure 3-12: 2DI ThermaViewer [2DI2006]

Beside basic display operations like zooming and scrolling, the device offers an interface to copy stored data to a computer for archival purposes. Moreover, critical minimum and maximum temperature limits can be set. Similar to XiltriX, this device is able to indicate alarms by the use of additionally attached equipment like sirens, flashlights or dialers. [2DI2006]

The ThermaViewer is a quite powerful solution for temperature monitoring purposes within small laboratories. But as it again uses the approach of setting critical temperature limits, it faces the same methodological problems as XiltriX. Due to the same approach and a similar implementation, the ThermaViewer complies with the requirements analysis in the same way as XiltriX. This is illustrated in Table 3-5.

Table 3-5: Compliance of the 2DI ThermaViewer According to the Requirements Analysis

Product is able to classify the current state of a monitored device (partly fulfilled)
Product is able to recognize significant changes of behavior in the short run (partly fulfilled)
Product is able to recognize significant changes of behavior in the long run (not fulfilled)
Product is able to predict upcoming failures (not fulfilled)
Product is able to identify failures as soon as they are recognizable (not fulfilled)
Product is able to avoid an error of the second kind in any case (not fulfilled)
Product is able to recognize external influences (not fulfilled)

An additional problem of the ThermaViewer is its limited decentralized data storage, which prevents extended data analysis in the long run, because only the last 44000 measuring points are available (that is about 15 months if a measuring point is stored every 15 minutes). But as this is not a methodological problem, but only a question of available memory, it can be neglected.

3.5.3 Systems Offering Data Analysis in Retrospect

Up to now, the products introduced within section 3.5 were relatively simple. In fact, most available systems in the market are kept as simple as the ones just introduced. The ThermaViewer, for instance, is already one of the few more highly developed systems, because it is able to store and display historical data. Most other monitoring products, like the “Temperature Alarm System” from Triple Red, just alarm in case of a temperature limit violation without offering additional information [Triple06].

Only very few centralized temperature monitoring systems like XiltriX exist in the market, because most laboratories are still not aware of the danger of malfunctioning cooling devices and do not install expensive monitoring products like that. 27 Beside XiltriX, there are the following major systems in the market:

1. Labguard 2 (AES Chemunex)
2. FlashLink Wireless System (DeltaTRAK)
3. Centron Environmental Monitoring System (Rees Scientific)

The basic approach of Labguard 2 and the FlashLink Wireless System is very similar to XiltriX. Sensors are attached to cooling devices to get information about the current state and to alarm in case predefined critical temperatures are exceeded. Furthermore, the collected data is stored on a centralized web server and can be accessed from every connected client computer. [AES06], [DeltaTRAK06]

The sensors of Labguard 2 and the FlashLink Wireless System do not need a substation or other wiring. They communicate directly with the web server by the exclusive use of radio signals. The FlashLink Wireless System is limited to 100 sensors, which can monitor temperature or humidity. Labguard 2 is able to communicate with more different kinds of sensors, so that the system is also able to monitor pressure, CO2, O2 and other substances. [AES06], [DeltaTRAK06]

The bundled software also does not differ much between the two systems. The FlashLink software offers basic functionality. It has to be installed on every machine that should gain access to the data. Besides setting critical alarm limits, it is possible to plot graphs from historical data and to alert a person in charge by E-Mail or recorded voice calls. Labguard 2 is very similar, but it only supports remote notifications via E-Mail. On the other hand, it offers extended possibilities to display graphs, including zooming, scrolling and a data export to Microsoft Excel. [AES06], [DeltaTRAK06]

Compared to each other, XiltriX offers the best of the three presented software programs, because it does not need a local installation on every client computer and offers additional features like the time dependent limit settings. But the general approach of just defining critical temperature limits remains the same. Hence, the reviews of the two monitoring solutions just presented and of XiltriX according to the requirements analysis are alike. 28

27 See section 2.1 for details
28 See section 3.4 for details


The third mentioned system, the Centron Environmental Monitoring System, is a building management system (BMS) designed to monitor the infrastructure of a whole building. Centron can be used to control access to labs or other important rooms. In addition, the system is able to save energy by turning off the lighting in empty rooms, for instance. [Rees06]

More important for this diploma thesis is the ability to monitor the temperature of cooling devices. Again, sensors are attached to freezers and the signal is transmitted to a web server. The possibilities to display and analyze historical data are very similar to XiltriX. The technology used is also similar, because historical data can likewise be accessed with a simple web browser. [Rees06]

Figure 3-13 exemplifies the ability to display two graphs with different temperature scales. Drawing a graph like that is impossible with XiltriX, because it is only capable of using one scale per axis. Another advantage of Centron is the possibility to integrate floor plans, so that in case of an alarm not only the name of a device is displayed but also its location. As Centron is not only a temperature monitoring system but a building management system, it offers the possibility to use nearly all kinds of connected devices to send a remote notification in case of a fridge malfunction. [Rees06]

But minor differences like these do not provide a better way of data analysis, because the general approach of just defining critical temperature limits remains the same. That is why Centron faces the same problems as all the other products introduced in the market. Hence, other approaches have to be found to improve the current situation. As already pointed out in section 2.4.1, additional sensors could partly mitigate the lack of information. But installing additional hardware is coupled with increasing expenses. That is why this diploma thesis focuses on analysis of the already recorded data to gain additional status information about monitored devices.

This chapter introduced the major monitoring products available in the market. Some systems are kept simple; other systems offer many additional features. But a detailed analysis of all these products revealed that they are all based on the same insufficient idea of setting critical temperature limits.

Hence, other approaches have to be found that use not only the current temperature but also the stored past data to determine a cooling device’s condition. Therefore, the next chapter will review the current state of research within the setting of sensor based temperature monitoring and other similar settings.


Figure 3-13: Centron - A Sample Graph with Multiple Scales [Rees06]


4 Current State of Research

As already mentioned in this diploma thesis, there seems to be no research activity within this particular setting of sensor based temperature monitoring of cooling devices at the moment. That is why the approach described in section 2.3 still seems to be state of the art.

Hence, this chapter will focus on similar fields of activity. The settings of machinery condition monitoring and of measurement data analysis seem to be promising. Therefore, current research activities within these fields will be introduced and tested for applicability.

4.1 Current State within the Setting of Sensor Based Temperature Monitoring

The only article found that describes exactly this setting was written by H. Bonekamp in 1997 and is called “Monitor to guard fridge temperature”. This article points out the importance of temperature monitoring of fridges, even at home. Especially spoiled food, due to too high temperatures within the fridge, could be avoided by the use of temperature monitoring. Bonekamp suggests and describes the installation of a small sensor based temperature monitoring device. It only consists of three LEDs to indicate a low, a correct or a high current temperature. [Bonekamp97]

As this approach is also based on the idea of just setting critical temperature limits to classify the current condition of a cooling device, other approaches have to be found. Therefore, the following subsections will briefly introduce common approaches and the current state of research from the related settings of machinery condition monitoring and measurement data analysis.

4.2 Current State within the Setting of Machinery Condition Monitoring

Condition monitoring of industrial machinery is done for many different reasons. Common ones are ([Kolerus95], p. 3):

• To avoid damage to machinery, employees or the environment
• To avoid unexpected breakdowns of machinery
• To do condition based maintenance
• Quality control


To achieve these and other goals, several approaches at different conception levels exist. In general, a distinction between four different levels is made ([Kolerus95], p. 4):

1. Surveillance
2. Early recognition of failures
3. Failure diagnosis
4. Trend analysis

Surveillance is the most basic goal. Its only task is to recognize a malfunction that has just occurred and to react in a predefined way (e.g. raise an alarm, shut down the machine, etc.). The early recognition of failures should detect not only malfunctions that have already occurred but also slightly developing misbehavior, to allow a reaction in advance of a total breakdown. The last two levels shall predict upcoming malfunctions before they actually occur. The failure diagnosis is based on the analysis of sensor data. The trend analysis extends this diagnosis by predicting the actual time the malfunction will happen.

Looking at the goals and the different conception levels of machinery condition monitoring indicates the similarity to sensor based temperature monitoring. Especially the last two conception levels are comparable to the requirements from section 2.5. Hence, approaches of machinery condition monitoring have to be tested for applicability.

Probably the most common way of machinery condition monitoring in practice is the usage of vibration analysis. Its basic idea is to obtain information about the current condition of a machine by measuring the vibration level of important moving parts. The vibration changes over time due to friction. This measured vibration is compared to a rated value to classify the current condition of the monitored parts. Depending on the kind of machine, different VDI guidelines exist that describe critical vibration values (e.g. VDI 2056, VDI 2059, etc.). ([Kolerus95], p. 8-19)

This kind of condition monitoring is quite easy to implement by just attaching vibration sensors to important parts. But due to the use of a single sensor, this approach also faces a lack of information and is not able to recognize external influences. 29 To reduce the probability of externally influenced results, a filter can be applied to the measuring data to cut off untypical frequency ranges (e.g. [Kolerus95], p. 22-29). But even this improvement still faces a lot of problems that are very similar to those described in section 2.4 (see e.g. [Pitter01], p. 63-68).

29 See section 2.4 for details

Hence, a goal of current research is to sense additional measures to improve this type of machinery condition monitoring. The research and development of sensors is mainly based on two ideas:

• To create multi sensors
• To create “intelligent” sensors

Multi sensors allow the monitoring of different measures at the same time and the same place ([Krallmann05], p. 50). The main advantages are cost savings, higher reliability and a fusion of different kinds of measurements at one place ([Pitter01], p. 76). 30

Intelligent sensors are based on mechatronics. The main idea of this research activity is to combine the fields of electronics, mechanics and information processing ([Pitter01], p. 27). This combination allows an interaction between mechanical and electronic parts in the form of a control cycle. The approach offers new possibilities of monitoring and data analysis. 31 But as this diploma thesis shall be based on currently existing data, there will be no focus on this activity.

Beside the improvement of sensors, current research also focuses on knowledge driven approaches. The main idea is to combine measured data with additional knowledge of the underlying process (e.g. [Tröltzsch06], p. 10). As this additional knowledge is specific to certain settings, only the general ideas of research activities from one setting can be applied to another one.

In general, two knowledge driven approaches can be identified:

• Knowledge models
• Artificial neural networks

A knowledge model is the most specific approach. It contains information about the underlying process; hence, current measurement values can be classified in a better way. Often, this kind of model is combined with vibration analysis, to determine friction for instance. In this case, a knowledge model could contain additional information about typical frictional behavior (e.g. linear friction vs. non linear friction). ([Sick00], p. 5-7)

30 See e.g. [Krallmann05], [Pitter01] for details
31 See e.g. [Pitter01]

Besides these specific knowledge models, there are artificial neural networks. These networks adopt the general functioning of a human brain. This means that an artificial neural network has to be trained with sample or historical data in advance, so that it is able to acquire knowledge. After this training, such a network is able to judge situations as regular or irregular, like a human brain. ([Hagen97], p. 5-6)

As this approach is able to learn on its own from training data or past behavior, it is much more flexible than a predefined knowledge model. 32 Both knowledge models and trained artificial neural networks can be used as a knowledge basis for expert systems, which have the task of deciding in an automated way (e.g. [Krems94], [Heuer97]).

4.3 Current State within the Setting of Measurement Data Analysis

The last section 4.2 introduced the current state of research within the setting of machinery condition monitoring, which faces similar requirements to sensor based temperature monitoring. This section will now focus on settings in which the analysis of time dependent data is used to detect changes and to predict upcoming behavior. The main focus lies on a generalized approach by Frank Daßler, which promises an early prediction of upcoming malfunctions without additional knowledge of the underlying setting ([Daßler95], p. 8).

4.3.1 Basic Approaches

Basic approaches are based on statistical methods. Descriptive statistical measures are used to get an aggregated overview of a dataset’s characteristics (e.g. the mean). In addition, these measures ease a comparison of different datasets (or of different parts of a dataset) ([Eckey02], p. 41).

Within many settings, time series analysis is applied to measurement data. Its main task is to discover structures and irregularities within a time sequence. By detecting structures, time series analysis is able not only to describe regular behavior but also to predict the near future. 33 ([Chatfield04], p. 73-105)

32 A detailed description of the functioning of an artificial neural network will be given in section 5.9.2.
33 See section 5.5 for details


Another current approach is regression. The main idea is to find a function that describes the temperature sequence best. A function found this way could be used for description as well as for prediction of future behavior ([Gentle02], p. 301). 34 Beside these approaches, artificial neural networks are again applied to gain additional knowledge (e.g. [Hawibowo97], p. 21-45). 35

The succeeding chapter 5 will introduce the methods identified within this chapter and will test them for applicability to the setting of sensor based temperature monitoring. Before that, the next section introduces an approach that promises to be generalized. Hence, it should be applicable to the current problem.

4.3.2 A Generalized Approach

As already mentioned in section 4.3, Frank Daßler presents an approach that should be able to predict future measurement values without any knowledge of the underlying setting. The main idea is to combine several known approaches into a new one. ([Daßler95], p. 7-8)

The presented approach is based on the idea of solving the problem of just setting critical limits. According to the author, the existing methods lead to three problems ([Daßler95], p. 19):

1. Just setting critical limits leads to sudden changes of the current state. As long as a value does not exceed the predefined range, the state is classified as OK.
2. At the moment of exceeding, an immediate reaction is necessary to resolve dangerous situations.
3. Limits with a lower span, chosen to reduce situations of immediate danger, lead to a higher quantity of false alarms due to outliers.

Figure 4-1 illustrates the general proceeding of the new approach, which shall be<br />

able to solve the just mentioned problems:<br />

34 See section 5.4 for details<br />

35 See section 5.9.2 for details<br />



Figure 4-1: General Overview of the Generalized Approach ([Daßler95], p. 22) (adapted)

The approach starts, like every analyzing method, with the collection and storage of measurement data over time. As this approach is meant to be general, no requirements or restrictions are defined for this activity. ([Daßler95], p. 23)

The biggest problem of analyzing measurement data is the influence of outliers on calculated results. According to the author, these outliers are evoked by technical disturbances. The following list contains some major causes: ([Daßler95], p. 52)

• Short-term measurement connection failures or short-circuits
• Short-term sensor failures
• Unstable working voltage
• …

As outliers are able to falsify calculated results, their elimination is the first proposed step of measurement data analysis. A big problem is the recognition of outliers, because nearly all measurement data is faced with noise, which causes small perceptible deviations. To identify outliers, the distances between succeeding measurement values are determined and compared to each other. These distances vary only marginally in case of constant noise. To be able to classify a measurement value as an outlier, a threshold value has to be found. The suggestion is to use twice the averaged distance between succeeding measurement values, as described by Formula 4-1. ([Daßler95], p. 54)



$$S_o = \frac{2}{n-1} \sum_{i=2}^{n} \left| Y_i - Y_{i-1} \right|$$

with:
$S_o$ = Outlier threshold value
$Y$ = Measurement values
$n$ = Number of measurement values

Formula 4-1: Threshold Value to Determine Potential Outliers

But ignoring every measurement value with a distance higher than $S_o$ would lead to a neglect of trends and other changes in behavior. That is why the number of outliers in a row is counted. Every possible outlier is set to the current mean value, as long as fewer than three values in a row are classified as outliers. In case of three or more such values in a row, no further elimination takes place. ([Daßler95], p. 54-55)

This approach is able to cut off single outliers. The only disadvantage is a delay of trend recognition, which is pictured in Figure 4-2. The green points represent the measured values with an existing change in trend. The red points illustrate the delay of trend recognition, because the first two higher values are classified as outliers and set to the mean value. ([Daßler95], p. 55)

Figure 4-2: A Delayed Trend Recognition Due to Removal of "Outliers"
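To make this elimination rule concrete, the following minimal Python sketch combines Formula 4-1 with the three-in-a-row rule. The function names and the comparison against the last cleaned value are illustrative assumptions, not part of the original approach:

```python
def outlier_threshold(values):
    """S_o: twice the averaged distance between succeeding values (Formula 4-1)."""
    n = len(values)
    return 2.0 / (n - 1) * sum(abs(values[i] - values[i - 1]) for i in range(1, n))

def eliminate_outliers(values, s_o):
    """Cut off up to two successive suspected outliers by setting them to the
    current mean; a third deviating value in a row is kept, because three or
    more deviations in a row indicate a change in trend, not a disturbance."""
    cleaned, run = [values[0]], 0
    for value in values[1:]:
        if abs(value - cleaned[-1]) > s_o and run < 2:
            run += 1
            cleaned.append(sum(cleaned) / len(cleaned))  # replace by current mean
        else:
            run = 0
            cleaned.append(value)                        # keep the measured value
    return cleaned
```

The sketch reproduces the delay pictured in Figure 4-2: the first two values of a real new trend are still replaced by the mean, and only the third one is kept.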

After eliminating outliers, the measurement data is stored in a ring memory. This kind of memory has a fixed size. As soon as not enough memory is available to add an additional measuring point, the oldest value is overwritten. This organization is used to avoid a high influence of too old values. The size of this ring memory is not accurately predetermined; suggested is a size of 100 to 150 values. ([Daßler95], p. 25)
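Such a ring memory can be sketched in Python with a bounded deque; the capacity of 120 is an arbitrary choice within the suggested range of 100 to 150 values:

```python
from collections import deque

# Fixed-size ring memory: once the buffer is full, appending a new
# measuring point silently overwrites the oldest one.
ring_memory = deque(maxlen=120)

for measurement in (4.2, 4.1, 4.3):  # illustrative temperature values
    ring_memory.append(measurement)
```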

Figure 4-1 pictures that the ring memory module communicates with three succeeding ones. The first step of analysis is the curve selection. The basic idea is to describe the stored measurement values within the ring memory by finding a mathematical function (called regression). The curve selection module determines this function by using the method of least squares.36 The acquired function is used by the succeeding module to predict upcoming values. ([Daßler95], p. 25-26)

These predicted values are used to recognize changes in trend. As soon as a change is recognized, the ring memory is cleared. This is especially important because old values that were stored before the change falsify the results. An identification of changes is done by comparing actually measured values to their corresponding predictions. In case a certain threshold is exceeded (see below), a new trend is assumed. ([Daßler95], p. 58-60)

The biggest problem of a reliable prediction is the already mentioned noise, because high noise could lead to the assumption of a new trend although the behavior stays the same. That is why the noise has to be determined. The first step is the calculation of an envelope. This envelope normally includes all peaks. To exclude potentially high peaks from the envelope, the following algorithm is used: ([Daßler95], p. 61)

1. Select the first five measurement values
2. Determine a line of best fit $f_b$ for these values
3. Calculate the distances between $f_b$ and the measurement values
4. Calculate the mean distance above the line ($d_a$)
5. Calculate the mean distance below the line ($d_b$)
6. Assign $d_a$ and $d_b$ to the measurement point right in the middle; assign $f_b(X_{max}) + d_a$ to the maximum value ($X_{max}$) and $f_b(X_{min}) - d_b$ to the minimum value ($X_{min}$)
7. If the end is not reached, deselect the first selected value, add the next one and go to 2.

36 Section 5.4 contains a detailed description of regression and the method of least squares
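A minimal Python sketch of this sliding-window envelope, assuming numpy and omitting step 6's special treatment of the window's extreme values for brevity:

```python
import numpy as np

def envelope(values, window=5):
    """For each window of five values, fit a line of best fit f_b, average the
    residual distances above (d_a) and below (d_b) it, and assign the envelope
    boundaries f_b(mid) + d_a and f_b(mid) - d_b to the window's middle point."""
    upper, lower = [], []
    for start in range(len(values) - window + 1):
        x = np.arange(start, start + window)
        y = np.asarray(values[start:start + window], dtype=float)
        a, b = np.polyfit(x, y, 1)                 # line of best fit f_b
        residuals = y - (a * x + b)
        above = residuals[residuals > 0]
        below = residuals[residuals < 0]
        d_a = above.mean() if above.size else 0.0
        d_b = -below.mean() if below.size else 0.0
        mid = x[window // 2]                       # middle point of the window
        upper.append(a * mid + b + d_a)
        lower.append(a * mid + b - d_b)
    return upper, lower
```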



This algorithm returns an upper and a lower boundary of an envelope. These boundaries can be used to determine the noise by using Formula 4-2. Similar to the detection of outliers, this noise can be used to identify changes in trend: if the distance between a measured and a predicted value is higher than three times the calculated noise, a significant change in trend must have taken place. ([Daßler95], p. 64)

$$N = \frac{1}{n} \sum_{i=1}^{n} \left( E_{a_i} - E_{b_i} \right)$$

with:
$N$ = Noise
$n$ = Quantity of values
$E_a$ = Upper boundary of envelope
$E_b$ = Lower boundary of envelope

Formula 4-2: Calculation of Noise
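Formula 4-2 and the three-times-noise criterion translate into two short helpers; a sketch, assuming the envelope boundaries computed above:

```python
def noise(upper, lower):
    """Mean width of the envelope (Formula 4-2)."""
    return sum(u - l for u, l in zip(upper, lower)) / len(upper)

def trend_changed(measured, predicted, current_noise):
    """A deviation of more than three times the noise between a measured
    value and its prediction signals a significant change in trend."""
    return abs(measured - predicted) > 3 * current_noise
```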

Beside the recognition of changes in trend, the determination of the prediction's probability is another important part of this approach. Four factors are identified that influence the probability of a correct prediction: ([Daßler95], p. 74)

• The quantity of measurement values
• The noise
• The curve stability
• The prediction stability

A small quantity of measurement values leads to a lack of information, as introduced in section 2.4.1. High noise likewise complicates an accurate prediction, as just introduced. The last two factors are new but easy to see. The curve stability specifies the duration of time the currently used describing function did not change. The prediction stability offers the percentage of correct predictions since the last change in trend. Curve stability and prediction stability can be determined by using Formula 4-3 and Formula 4-4: ([Daßler95], p. 75-76)



$$S_c = \frac{c}{n} \times 100\%$$

with:
$S_c$ = Curve stability
$c$ = Quantity of measurement values that did not change the predicted curve
$n$ = Quantity of measurement values after last change in trend

Formula 4-3: Calculation of Curve Stability

$$S_p = \frac{C_p}{n} \times 100\%$$

with:
$S_p$ = Stability of prediction
$C_p$ = Counter for correct predictions
$n$ = Quantity of measurement values after last change in trend

Formula 4-4: Calculation of Prediction Stability

The four just introduced criteria are used to calculate the prediction's probability. But as the single values influence each other, a simple multiplication is not sufficient to calculate the total probability of correctly predicted values. Other methods that are based on fixed limits of acceptable and unacceptable values are, again, faced with the already mentioned sudden change of state. That is why fuzzy logic is used for the calculation. ([Daßler95], p. 66-67)

The main idea of fuzzy logic is to allow not only the answers 1 and 0 (as yes and no), but also values in between. This allows a better implementation of linguistic terms like "rather yes". Hence, the total probability can be determined more precisely by using the four above mentioned factors. The presented approach uses the center of gravity method to calculate the results. But as this method is not relevant within this diploma thesis, it will not be presented here.37
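The gradual transitions that fuzzy logic provides can be illustrated with a minimal membership function; the breakpoints below are invented for illustration and are not taken from the presented approach:

```python
def membership_high_noise(noise_level, low=0.1, high=0.5):
    """Degree (0..1) to which a noise level counts as 'high': 0 below `low`,
    1 above `high`, and a linear transition in between instead of a sudden
    change of state at a single fixed limit."""
    if noise_level <= low:
        return 0.0
    if noise_level >= high:
        return 1.0
    return (noise_level - low) / (high - low)
```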

The last step of the presented approach is the verification of predefined conditions, as pictured in Figure 4-1. This module evaluates the following three criteria, which have to be predefined by a person in charge: ([Daßler95], p. 30)

37 For details on fuzzy logic, see (e.g. [Turunen99], chapter 3-4; [Kosko99], chapter 1); for details on the actual implementation of fuzzy logic, consult ([Daßler95], p. 79-89).



• Critical values
• Pre-warning time
• Prediction probability

Hence, the presented approach allows predefinitions like, for instance: "If the temperature will reach 10°C within the next 5 minutes with a probability of 90 percent, then send a trigger signal."
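Such a predefinition can be read as a simple rule over the three criteria; a sketch with illustrative parameter names and default values:

```python
def should_trigger(predicted_temp, minutes_ahead, probability,
                   critical_temp=10.0, pre_warning_minutes=5,
                   min_probability=0.9):
    """Send a trigger signal if the critical value is predicted to be reached
    within the pre-warning time with at least the demanded probability."""
    return (predicted_temp >= critical_temp
            and minutes_ahead <= pre_warning_minutes
            and probability >= min_probability)
```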

4.4 Review of Current State of Research

The current chapter introduced different approaches from similar monitoring settings. Section 4.1 denoted the missing research activity within the setting of sensor based temperature monitoring. Section 4.2 pointed out the similarity to machinery condition monitoring. Basic methods like a comparison of current behavior to rated values can be found in both settings. In addition to that, machinery condition monitoring also uses knowledge driven approaches like artificial neural networks, which are not yet available within the setting of sensor based temperature monitoring. Hence, a test of the applicability of this general idea has to be made.38

Section 4.3 introduced approaches from the setting of measurement data analysis. Basic approaches like descriptive statistical measures, time series analysis and regression also have to be tested for applicability.39

The introduced generalized approach from section 4.3.2 promises to be applicable to all kinds of measurement data without any knowledge of the underlying setting. Therefore, this approach is now reviewed according to the requirements analysis from section 2.5.

The first step was the elimination of outliers. This was based on the assumption that outliers are evoked by technical disturbances, which have to be ignored. An application to sensor based temperature monitoring could lead to ignoring high temperature peaks that are actually caused by door openings. But even if a change in trend is recognized, a delay of at least two time intervals is caused. Hence, the approach is not able to identify upcoming failures as soon as they are recognizable.

38 See section 5.9.3 for details
39 See sections 5.3, 5.4 and 5.5 for details



Another problem is caused by the ring memory. It was introduced to ignore old measurement values in order to avoid a high influence of these values. This allows the recognition of significant changes of general behavior in the short run but disables analysis in the long run.

The succeeding steps (curve selection, calculation of the prediction, recognition of changing trends and determination of the prediction's probability) are faced with two big problems, which are caused by a missing ability to recognize external influences. First of all, every door opening would lead to the assumption of a change in general behavior, although it should be ignored if a general change is to be identified. Moreover, this approach is also not capable of predicting upcoming failures, because door openings cannot be distinguished from real malfunctions. Faced with these problems, the verification module is not able to offer more reliable information about a device's current state than the introduced method from section 2.3. Table 4-1 summarizes the just made review.

Table 4-1: Compliance of the Generalized Approach According to the Requirements Analysis

• Approach is able to classify the current state of a monitored device
• Approach is able to recognize significant changes of behavior on the short run
• Approach is able to recognize significant changes of behavior on the long run
• Approach is able to predict upcoming failures
• Approach is able to identify failures as soon as they are recognizable
• Approach is able to avoid an error of second kind in any case
• Approach is able to recognize external influences

The introduced generalized approach seems to be applicable within many monitoring settings that suit the author's assumptions. But especially the ignoring of outliers and the very low probability of a real technical malfunction40 lead to a non-applicability of this approach to the setting of sensor based temperature monitoring in practice. By contrast, single ideas, like the usage of regression and the other above mentioned approaches, will be described and tested for applicability in the following chapter.

40 See section 2.2.5 for details



5 Possible and Promising Ways of Data Analysis

The first four chapters already introduced the existing methodological problems of sensor based temperature monitoring systems and the current state of research. The second chapter pointed out the existing problems of the temperature monitoring task and the limitations of the current approach of just setting critical temperature limits. The biggest problem was the existing lack of information.41 This prevents an analyst from identifying the real causes of temperature deviations. Moreover, a very low probability of real malfunctions leads to many false alarms.42

The introduction of currently available temperature monitoring products in the third chapter pointed out that no solution seems to be available that is based on another approach. Only some workarounds, like time dependent limit settings, are offered to solve the existing problems partially.43 In fact, no introduced product fully complied with the requirements from section 2.5.

In addition to that, the fourth chapter pointed out that there seems to be no research activity within this particular setting of sensor based temperature monitoring of cooling devices within medical laboratories. That is the reason why this chapter tries to find ways to gain additional information about monitored devices by the use of statistical analysis and data mining. Due to missing specialized methods, the current research begins with an analysis of basic statistical and data mining methods. Aside from that, other specialized methods from the fourth chapter are introduced and tested for applicability.

5.1 The Six Possible Levels of Data Analysis

To be able to categorize different approaches of data analysis, it is important to review its possible kinds. Data analysis can be divided into six different levels of detail. Depending on the demands of the underlying setting, data analysis ranges from highly abstract to very detailed. According to the chosen level, one of the following kinds of results is aimed at: ([Berthold99], p. 171)

41 See section 2.4.1 for details
42 See section 2.2.5 for details
43 See section 3.3.2 for details



1. Descriptive models
2. Numerical models
3. Graphical models
4. Statistical models
5. Functional models
6. Analytic models

Descriptive models represent the most abstract level of data analysis. They describe circumstances just by the use of verbal phrasing. ([Berthold99], p. 171) The sentence "The warmer the room ambient temperature, the higher the electric power consumption of a freezer", for instance, already composes a small descriptive model. Although this kind of model does not give precise information about magnitudes, it offers enough knowledge within many situations.

By contrast, a descriptive model is not capable of solving the existing problems within the setting of sensor based temperature monitoring, because it is too abstract. A statement like "a cooling device is malfunctioning in case of reaching an uncommon temperature without being influenced externally" describes the problem very accurately. But it does not mention how to recognize these influences.

Numerical models offer a more detailed description in the form of a table. ([Berthold99], p. 171) Relating to the exemplified descriptive room ambient temperature model, an associated numerical model lists concrete room ambient temperatures versus the corresponding electric power consumption. The third level of data analysis is just the graphical representation of a numerical model. This is especially useful to get an overview of large datasets within a very short time.44

The current methodological approach of predefining critical limits to classify the current state of a monitored system is a numerical model, because every temperature value is assigned to either "cooling device is OK" or "cooling device is malfunctioning". The graphical abilities of the introduced products suffice to comply with the third level of data analysis as well.

The last three levels of data analysis are not based on verbal phrases or sample values but on mathematical "descriptions" for all values to achieve a higher level of detail. Statistical models use measures like, for instance, the mean temperature or the standard deviation to illustrate coherences ([Berthold99], p. 172). A sample model could be: "The mean increase of a freezer's electric power consumption is about 5% per degree of room ambient temperature".

44 See section 3.2.1 for details

Functional models use functions to describe the existing behavior. ([Berthold99], p. 172) Finding a functional description can be very difficult and is not always possible. A functional model of the freezer's electric power consumption would allow a calculation for every given room temperature. Furthermore, it could help to predict malfunctions of a cooling device by just comparing current behavior to the describing function.

Most powerful and detailed are analytic models. They describe coherences by the use of algebraic or differential equations. This allows a very detailed description of outputs for all kinds of imaginable inputs. ([Berthold99], p. 172) As already pointed out in section 2.4.1, many in- and outputs and their coherences are unknown due to very few sensors. That is why it seems to be impossible to find analytic models with the currently available datasets.

Hence, this diploma thesis will first of all focus on possibilities to create statistical and functional models to gain more detailed information about monitored cooling devices. Only in case a degree of total information is reached with these kinds of models would an attempt to determine an analytic model be useful.

5.2 Different Kinds of Statistical Analysis

The general purpose of statistical analysis is to provide information to support important decisions. The main idea is to improve the quality of the decision making process by reducing uncertainties as far as possible. In general, statistics is divided into two branches: ([Holland01], p. 3)

1. Descriptive statistics
2. Inferential statistics

Descriptive statistical methods describe large available datasets. Their main purpose is to summarize and to evaluate them. Another important task is the filtering of the most important facts to get an overview of the underlying dataset. Typical results of descriptive statistical methods are statistical measures like the mean or the standard deviation. The results are presented in the form of a table or a graph to offer a quick overview. ([Holland01], p. 3)



Inferential statistical methods do not describe available datasets but try to gain additional information from the existing data. These methods are applied to problems where datasets cannot be obtained entirely ([Scharnbacher04], p. 43). After obtaining parts of the totality, inferential statistical methods are used to generalize the gained results ([Bourier03], p. 3).

Due to the fact that modern computer systems and mainframes are able to process large amounts of data within a very short time, statistical analysis offers additional calculation possibilities (e.g. data mining). Furthermore, the ability to collect data in an automated way increases the data base. As a result, the probability is higher that the gained generalized results are correct ([Eckey02], p. 3).

5.3 Basic Descriptive Statistical Measures

Section 1.2 defined the two main goals of this diploma thesis. The first one was to gain additional knowledge about the cooling device's condition from recorded datasets to offer additional decision support in case of an exceptional temperature level. Therefore, a summarization and evaluation of available datasets by using descriptive statistics appears to be a promising approach. Hence, the succeeding subsections will introduce both basic and special descriptive statistical measures from other already introduced monitoring settings.45 Moreover, the expected gain of information is evaluated.

Descriptive statistics offer some very common measures that can be applied easily to all kinds of numerical data. The best known are: ([Holland01], chapter 4)

• Minimum and Maximum
• The Mode
• The Median
• The Mean
• The Standard Deviation

The first basic descriptive statistical measures are the minimum and the maximum value. Their determination requires very little time and effort. Nevertheless, these values can already indicate uncommon behavior.46

45 See chapter 4 for details
46 See section 6.2.1 for details



The mode is the most frequent value of a dataset. It can be regarded as a kind of center of a sorted dataset ([Eckey02], p. 42). Therefore, the mode is suitable to get a quick overview of the main behavior of large datasets. A disadvantage of the mode is that it ignores outliers.

The median is a value that divides a dataset into two parts of the same extent. To calculate the median, the single values of the given dataset have to be sorted by size into a row, so that $X_{(1)} \le X_{(2)} \le \ldots \le X_{(n)}$ is complied with. Afterwards, the value right in the middle is the median. In case of an even number of values, the mean of the two mid values is taken, as described by the following Formula 5-1: ([Eckey02], p. 44)

$$\tilde{X} = \begin{cases} X_{((n+1)/2)} & \text{if } n \text{ odd} \\[4pt] \frac{1}{2}\left( X_{(n/2)} + X_{(n/2+1)} \right) & \text{if } n \text{ even} \end{cases}$$

Formula 5-1: The Median Formula

In case of a normal distribution, mode and median are very similar. This behavior changes if the most frequent temperature value tends toward one of the interval borders. Moreover, the median also ignores outliers.

Probably the most common value in statistics is the mean. In fact, several different types of mean values exist. Talking about the mean generally just denotes the arithmetic mean. It is calculated by summing up all values of a dataset and subsequently dividing by the dataset's quantity. Formula 5-2 describes this procedure in mathematical form. In contrast to mode and median, the arithmetic mean does not ignore outliers but weights every single value the same. ([Bourier03], p. 79)

$$\bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i$$

Formula 5-2: The Arithmetic Mean Formula

As already mentioned, there are several different mean values. Beside the already introduced arithmetic mean, there are the geometric and the harmonic mean. The geometric mean is especially used to analyze growth rates ([Bourier03], p. 84). The harmonic mean is defined to provide mean values of ratios ([Eckey02], p. 54). As monitoring data within the setting of sensor based temperature monitoring involves neither growth rates nor ratios, these methods will not be presented.



Another group of mean values are the weighted and moving ones. A weighted arithmetic mean, for instance, can be used to calculate a correct mean temperature if the underlying dataset contains different time ranges. It is also possible to assign a higher importance to newer values (e.g. current outliers). The moving arithmetic mean always calculates a mean value by using the same number of values. Typically, the newest values are taken. As long as monitoring data is saved within constant time ranges, this method allows a calculation of mean values for a defined time span, e.g. the last three hours. Furthermore, it is also possible to add weighting to this kind of mean.
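Both variants can be sketched in a few lines of Python; the window size and weights are illustrative:

```python
def moving_mean(values, window):
    """Arithmetic mean over the newest `window` values at each step, e.g. the
    last three hours of measurements when values arrive at constant intervals."""
    return [sum(values[i - window:i]) / window
            for i in range(window, len(values) + 1)]

def weighted_mean(values, weights):
    """Weighted arithmetic mean, e.g. to assign newer values a higher importance."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)
```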

Up to now, the presented statistical values only analyzed average behavior. It was not possible to get further information about outliers. Therefore, the standard deviation is needed. It describes the mean variation of data values and is calculated by using Formula 5-3. ([Eckey02], p. 71)

$$\sigma = \sqrt{ \frac{ \sum \left( X_i - \bar{X} \right)^2 }{n} }$$

Formula 5-3: The Standard Deviation Formula

This measure offers quite a lot of information in combination with the arithmetic mean. A low standard deviation indicates only slight variation around the mean value. By contrast, a high standard deviation indicates greater variation.
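All the basic measures of this section are available in Python's standard library; a short sketch with illustrative freezer temperatures:

```python
import statistics

temperatures = [-20.1, -19.8, -20.1, -20.0, -19.7, -20.1]  # illustrative values

minimum, maximum = min(temperatures), max(temperatures)
mode = statistics.mode(temperatures)      # most frequent value
median = statistics.median(temperatures)  # splits the sorted dataset (Formula 5-1)
mean = statistics.mean(temperatures)      # arithmetic mean (Formula 5-2)
sigma = statistics.pstdev(temperatures)   # standard deviation over n (Formula 5-3)
```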

Section 5.10.1 describes a promising approach for how these just presented basic statistical measures can be used to improve the current situation of insufficient information.

5.4 Regression

The general idea of regression is to describe a dataset of value pairs (x, y) by a functional model, as described in section 5.1 ([Gentle02], p. 301). Looking at time series data, regression tries to determine a functional model that describes the change of a value y over time x. Figure 5-1 pictures two examples of regression.



Figure 5-1: Two Samples of Regression ([Bourier03], p. 167) (adapted)

5.4.1 The Determination of Regression Functions

A common approach to determine such a regression function is the method of least squares. This method is divided into three steps: ([Bourier03], p. 167)

1. The determination of the general trend from a graphical visualization or from knowledge
2. The assignment of this general trend to a mathematical type of function
3. The numerical determination of the function's parameters

The first two steps are normally trivial and have to be done as an initialization part. The third step has to determine the function's parameters in such a way that the function describes the development of the values best. To do that, the distance between the determined regression function and all available values has to be minimal, which leads to the method of least squares in Formula 5-4. The squaring is necessary to prevent positive and negative deviations from cancelling each other out.47

$$\min \sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2$$

with:
$y_i$ = Occurred value at time $t = i$
$\hat{y}_i$ = Value of regression function at time $t = i$

Formula 5-4: Method of Least Squares

Based on this method, a regression function can be determined for a given type of function. Often applied types are: ([Daßler95], p. 43)

47 See ([Bourier03], p. 168-169) for details



1. $\hat{y} = ax + b$ (Linear function)
2. $\hat{y} = ax^b$ (Exponential function)
3. $\hat{y} = ae^{bx}$ (Euler function)
4. $\hat{y} = \frac{a}{x + b}$ (Hyperbola)
5. $\hat{y} = a \ln(x) + b$ (Logarithmic function)

The assumption of a linear trend leads to the usage of $\hat{y} = ax + b$ as the regression function. Applying the method of least squares leads to Formula 5-5:

$$\min \sum_{i=1}^{n} \left( y_i - b - ax_i \right)^2$$

Formula 5-5: Method of Least Squares for an Assumed Linear Trend

To determine the parameters $a$ and $b$, it is necessary to partially differentiate Formula 5-5 with respect to these parameters. Afterwards, the resulting equations have to be solved for $a$ and $b$. Performing these two steps leads to the following optimal linear regression Formula 5-6: ([Bourier03], p. 169-171)

$$\hat{y} = ax + b \quad \text{with} \quad a = \frac{\sum x_i y_i - n\bar{x}\bar{y}}{\sum x_i^2 - n\bar{x}^2}, \qquad b = \bar{y} - a\bar{x}$$

Formula 5-6: Regression Function for Describing a Linear Trend
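Formula 5-6 translates directly into a short Python helper; the function name is an illustrative choice:

```python
def linear_regression(x, y):
    """Optimal parameters of the regression line ŷ = ax + b (Formula 5-6)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    a = ((sum(xi * yi for xi, yi in zip(x, y)) - n * x_bar * y_bar)
         / (sum(xi * xi for xi in x) - n * x_bar * x_bar))
    b = y_bar - a * x_bar
    return a, b
```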

Other types of functions, like the ones mentioned above, can also be used for regression purposes. The general idea stays the same; only the calculation steps vary with different functions. As other types of regression are not of interest within this diploma thesis,48 they will not be regarded here.49

48 See section 5.10.2 for details
49 More details can be found in (e.g. [Eckey02], p. 171-184; [Bourier03], p. 172-179)



5.4.2 The Major Problems of Regression

Up to now, this section just introduced the approach of regression. The last part of this section will now review its two major problems: ([Eckey02], p. 179)

• An incorrectly chosen type of function leads to unacceptable results
• Significant outliers influence the determination of a regression function

The first problem can be solved partly by trying several types of functions. Afterwards, the best result can be selected. This is especially useful in cases of automated regression, where the general type of function may change. Problematic is the application of regression to purely random data, because a selection of a certain type of function might be impossible.

The second problem can be even worse, because even a correctly chosen type of function might lead to significantly incorrect results due to the influence of outliers. The two following figures exemplify this. Both Figure 5-2 and Figure 5-3 contain a linear trend. But the obtained regression function for the first dataset is significantly wrong due to a single outlier.

Figure 5-2: Incorrect Regression Function due to an Outlier ([Eckey02], p. 180) (adapted)

Figure 5-3: Correct Regression Function ([Eckey02], p. 180) (adapted)



A graphical representation like this allows an easy validation of the obtained regression function. But there is also a mathematical measure that offers a quality factor. It is called the coefficient of determination and is defined by Formula 5-7. The general idea is to split the total variance ($Var_y^2$) into the variance caused by regression ($Var_{\hat{y}}^2$) and the residual variance ($Var_u^2$).50

$$R^2 = \frac{Var_{\hat{y}}^2}{Var_y^2}$$

with:
$Var_y^2 = Var_{\hat{y}}^2 + Var_u^2$
$Var_{\hat{y}}^2 = \frac{1}{n} \sum_{i=1}^{n} \left( \hat{y}_i - \bar{y} \right)^2$
$Var_u^2 = \frac{1}{n} \sum_{i=1}^{n} u_i^2$

Formula 5-7: Coefficient of Determination

The coefficient of determination is a value between 0 and 1. If the regression does not offer any additional information beyond the mean value, no variance is caused by the regression. This leads to a coefficient of 0. By contrast, a coefficient of 1 would be caused by a regression variance that is of the same magnitude as the total variance. In that case, every occurred value is actually part of the determined regression function. ([Eckey02], p. 181)
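A sketch of Formula 5-7 in Python, computed from the occurred values and the values of the regression function:

```python
def coefficient_of_determination(y, y_hat):
    """R²: share of the total variance that is explained by the regression
    (Formula 5-7); values close to 1 indicate a well fitting function."""
    n = len(y)
    y_bar = sum(y) / n
    var_regression = sum((yh - y_bar) ** 2 for yh in y_hat) / n
    var_total = sum((yi - y_bar) ** 2 for yi in y) / n
    return var_regression / var_total
```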

In practice, a coefficient of determination of at least 0.8 is demanded. In case of time sequences, an even higher coefficient such as 0.9 is demanded ([Eckey02], p. 181). A regression function with such a high coefficient can be used for prediction purposes by just calculating regression values for the near future.

Section 5.10.2 will introduce a promising application of regression to determine a trend that indicates a change in behavior. In contrast to that, prediction of an upcoming malfunction is not possible by using regression, because every significant temperature rise would be predicted as an upcoming malfunction. As most rises are caused by external influences and not by changes of general behavior, a usage of regression for prediction purposes would lead to at least the same high quantity of false alarms.

50 See ([Eckey02], p. 180-181) for details



The next section will introduce time series analysis. In contrast to regression, it is limited to time series data, but it offers more analysis possibilities.

5.5 Time Series Analysis

"A time series is a collection of observations made sequentially through time" ([Chatfield04], p. 1). The major idea of time series analysis is to decompose the variation of a time series graph into the four following components to obtain a structure: ([Chatfield04], p. 12)

1. Trend (t)
2. Seasonal variation (s)
3. Other cyclic variation (o)
4. Other irregular fluctuations (i)

In some cases, trend and other cyclic variations are combined, so that only three components exist (e.g. [Bourier03], p. 158). In the following, this diploma thesis will focus on the more common decomposition into four components.

The first component could be defined as a "long-term change in the mean level" ([Chatfield04], p. 12). The greatest problem is the definition of "long-term": depending on the setting, days could be meant as well as decades. The seasonal variation offers information about predictable recurring behavior (e.g. buying behavior in wintertime vs. buying behavior in summertime).

Other cyclic variations are predictable as well but cover a smaller time span than the seasonal variations. For instance, buying behavior in the daytime is higher than at nighttime. This could be described by cyclic variation. Behavior that cannot be explained by one of the just mentioned components has to be classified as other irregular fluctuations. These irregular fluctuations have to be kept small to get an expressive decomposition of a dataset's variation. ([Chatfield04], p. 12)

Figure 5-4 exemplifies a marketing time series. The seasonal variation is easy to see, because sales reach a maximum every winter and a minimum every summer. Moreover, a trend is recognizable, because every summer a higher maximum and every winter a higher minimum is reached. After falling in December, there is another small peak in January in most years. This could be classified as cyclic variation.



Figure 5-4: Sales of an Industrial Heater [Chatfield04]

A decomposition of a time series $y_t = (t_t, s_t, o_t, i_t)$ like that allows a nearly complete description. Deviations are very small and have to be classified as other irregular fluctuations. A prediction based on such a time series leads to much better results than regression, because the regular variations t, s and o are taken into account.

To be able to identify these components, it is first of all necessary to define the interaction of the single components. In general, two models exist. Formula 5-8 pictures the additive one and Formula 5-9 the multiplicative one. ([Bourier03], p. 158-159)

$$y_t = t_t + s_t + o_t + i_t$$

Formula 5-8: The Additive Component Model

$$y_t = t_t \cdot s_t \cdot o_t \cdot i_t$$

Formula 5-9: The Multiplicative Component Model

The first model is normally used if cyclic variations with constant amplitude are assumed. By contrast, the multiplicative model is used if the cyclic components increase over time. ([Eckey02], p. 188)

The first step of a time series analysis is the identification of a possible trend. A very common approach to identify a trend is the usage of the least squares method, as explained in section 5.4. If such a trend is present, it can be removed from the existing data, so that the residual components can be determined. A time series without a trend is called stationary ([Chatfield04], p. 13). In fact, most methods require stationary time series data.
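Detrending under the additive model can be sketched in a few lines, assuming numpy and a linear trend:

```python
import numpy as np

def detrend_additive(x, y):
    """Remove a least-squares linear trend (section 5.4); under the additive
    model, the residuals y_t - t_t still contain the seasonal, cyclic and
    irregular components and form a stationary series."""
    a, b = np.polyfit(x, y, 1)
    return np.asarray(y, dtype=float) - (a * np.asarray(x, dtype=float) + b)
```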

After obtaining the trend, seasonal and other cyclic variations can be determined. This is done, for instance, by the use of the periodogram method. The general idea is to determine the distances to the trend function and to discover regular patterns.51

But as already mentioned in section 5.4.2, a prediction of an upcoming malfunction is not possible, because every significant rise in temperature would lead to such a prediction. Moreover, the very low probability of a real malfunction52 in combination with randomly occurring external influences has to be classified as irregular variation. Faced with these significant irregular variations, time series analysis is not able to offer additional improvements compared to regression.

5.6 Failure and Availability Ratios

Common within the settings of quality assurance and condition monitoring are operating ratios that specify the availability of systems and their tendency to run into failure.

Common ratios to define this behavior are the "mean time to failure" (abbr. MTTF), the "mean time between failures" (abbr. MTBF) and the "mean time to repair" (abbr. MTTR). The first two measures characterize the average time a unit is working correctly before breaking down. The only difference between MTTF and MTBF is that the first one is used for parts that cannot or should not be repaired but replaced, while the second one gives the average time between two necessary repairs of high value parts. The MTTR characterizes the average time the repairing takes. ([Masing88], p. 113)

These ratios can be used to specify the availability of systems by using Formula 5-10. This availability allows probability calculations as to whether a system can be used during a specified time.

51 See (e.g. [Bourier03], p. 180-189) for details
52 See section 2.2.5 for details



$$\text{Availability} = \frac{MTBF}{MTBF + MTTR}$$

Formula 5-10: The Definition of Availability [Masing88]
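As a worked example with invented figures: assuming a cooling device with an MTBF of 5,000 hours and an MTTR of 10 hours, Formula 5-10 yields an availability of $5000 / (5000 + 10) \approx 0.998$, i.e. the device would be expected to be operational about 99.8% of the time.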

The general idea of calculating the estimated availability during a specified time seems promising. But this method of failure and availability ratios faces a major problem when it is applied to the setting of sensor based temperature monitoring: most manufacturers of cooling devices do not offer ratios like the MTTF [Nijmegen06]. As cooling devices are long-life products, an own determination of these measures is also impossible. Hence, failure and availability ratios are not applicable within the setting of sensor based temperature monitoring.

5.7 Markov Chains

Another approach to predicting breakdowns is the usage of Markov chains. These chains are simple time-discrete stochastic processes $(X_n)_{n \in \mathbb{N}_0}$ with a countable state space $I$ that comply with the following Formula 5-11 for all points in time $n \in \mathbb{N}_0$ and all states $i_0, \ldots, i_{n-1}, i_n, i_{n+1} \in I$: ([Waldmann04], p. 11)

$$P(X_{n+1} = i_{n+1} \mid X_0 = i_0, \ldots, X_{n-1} = i_{n-1}, X_n = i_n) = P(X_{n+1} = i_{n+1} \mid X_n = i_n)$$

Formula 5-11: The Markov Property

This Markov property is the specific characteristic of Markov chains. It says that the probability of changing to another state is only influenced by the last observed state and not by prior ones. Hence, the probability that $X_{n+1}$ takes the value $i_{n+1}$ is only influenced by $i_n \in I$ and not by $i_0, \ldots, i_{n-1} \in I$. ([Waldmann04], p. 11)

The conditional probability $P(X_{n+1} = i_{n+1} \mid X_n = i_n)$ is called the process's transition probability. If this transition probability is independent of the point in time $n$, the Markov chain is called homogeneous; otherwise it is called inhomogeneous ([Waldmann04], p. 11). In the following, this thesis will first of all focus on homogeneous Markov chains. To improve readability, they will just be named Markov chains.

In the majority of cases, the transition probability is written as a matrix $P$. It contains the probabilities $p_{ij}$ of all possible changes between an old state $i$ and a new state $j$, as pictured in Formula 5-12. Beside a change in state, it is also possible that the state remains the same for another time interval. This probability is given by $p_{ii}$ within each column. ([Beichelt97], p. 146)

$$P = \begin{pmatrix} p_{00} & p_{01} & \cdots & p_{0j} \\ p_{10} & p_{11} & \cdots & p_{1j} \\ \vdots & \vdots & & \vdots \\ p_{i0} & p_{i1} & \cdots & p_{ij} \end{pmatrix}$$

Formula 5-12: Transition Probability Matrix

As every $p_{ij}$ represents a probability, they all have to comply with $0 \le p_{ij} \le 1$. Moreover, every state must have a succeeding state. Hence, the probability of taking one of the available countable states as the next one has to be one hundred percent. This leads to the conditions pictured in Formula 5-13 for the transition probability matrix. ([Jondral02], p. 186-187)

$$0 \le p_{ij} \le 1 \quad \forall i, j \qquad \text{and} \qquad \sum_{j=1}^{N} p_{ij} = 1 \quad \forall i$$

Formula 5-13: Conditions for the Transition Probability Matrix

As the sum of every row within that matrix has to be 1, $p_{ij} = 0$ entries can be left out to offer a better overview. To achieve an even better overview, Markov chains are often visualized as a graph. Every node of that graph represents a possible state and every arrow a possible transition with a positive probability. ([Waldmann04], p. 17)

A Markov chain can be used, for instance, to describe the following gamble between two people: a coin is thrown and, depending on which side faces up, one of the two players wins the coin. Player one starts with four coins, player two with two coins. The game ends as soon as one of the players owns all six coins. This leads to seven possible states, because a player can own every number of coins between zero and six. Provided that every throw has the same winning probability $p$, the transition probability matrix would look like the one pictured in Formula 5-14. As described above, this Markov chain can be visualized as a graph to allow a better overview of the described process. A comparison of Formula 5-14 and the corresponding Figure 5-5 shows this improvement.53

53 Example taken from ([Waldmann04], chapter 2)



$$P = \begin{pmatrix} 1 & 0 & 0 & 0 & 0 & 0 & 0 \\ 1-p & 0 & p & 0 & 0 & 0 & 0 \\ 0 & 1-p & 0 & p & 0 & 0 & 0 \\ 0 & 0 & 1-p & 0 & p & 0 & 0 \\ 0 & 0 & 0 & 1-p & 0 & p & 0 \\ 0 & 0 & 0 & 0 & 1-p & 0 & p \\ 0 & 0 & 0 & 0 & 0 & 0 & 1 \end{pmatrix}$$

Formula 5-14: Sample Transition Probability Matrix

Figure 5-5: Sample Transition Probability Graph

Up to now, the transition probability matrix only made it possible to obtain the probability of a single change. But probabilities of several changes in a row, as pictured in Formula 5-15, are also important. ([Beichelt97], p. 147)

$$p_{ij}^{(m)} = P(X_{n+m} = j \mid X_n = i) \qquad m = 1, 2, \ldots$$

Formula 5-15: Transition Probabilities of Several Changes in a Row

$p_{ij}^{(m)}$ symbolizes the probability that state $i$ will change to state $j$ after $m$ steps. Apparently, $p_{ij}^{(1)} = p_{ij}$ holds. The calculation for $m > 1$ can be done by using the formula of Chapman-Kolmogorov, which is pictured in Formula 5-16. ([Beichelt97], p. 147)

$$p_{ij}^{(m)} = \sum_{k \in I} p_{ik}^{(r)} \, p_{kj}^{(m-r)} \qquad r = 1, 2, \ldots, m-1$$

Formula 5-16: Formula of Chapman-Kolmogorov

Using the knowledge that one state of the countable state space has to be taken after $r$ steps, in combination with the law of total probability and the Markov property, leads to an easy argumentation.54 As a result, the transition probability matrix for $r$ steps is determined by multiplying the matrix $r$ times by itself. This enables a simplified version of Formula 5-16. ([Beichelt97], p. 148)

54 See ([Beichelt97], p. 147) for details



$$P^{(m)} = P^{(r)} \cdot P^{(m-r)} \qquad \text{with} \qquad P^{(m)} = \left( p_{ij}^{(m)} \right), \quad m = 1, 2, \ldots$$

Formula 5-17: Formula of Chapman-Kolmogorov (Simplified Version)

This simplified Formula 5-17 allows an argumentation that shows that every Markov chain can be described completely by just giving a starting distribution at step 0 and a transition matrix.55

As mentioned above, a Markov chain is often used to predict breakdowns. Therefore, the existing states have to be classified as critical and uncritical ones. In general, states $i \in I$ with $p_{ii} = 1$ are critical ones. They are called absorbing states. Figure 5-5 contains two absorbing states, because after taking state 0 or state 6, all following states will remain the same. Markov chains can now be used to determine the probability that a critical state is taken. If an absorbing state is taken with a probability of one hundred percent, the mean number of steps after which an absorbing state is taken can also be determined. ([Waldmann04], p. 18)
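A minimal numpy sketch of this determination for the coin game of Formula 5-14, assuming a fair coin (p = 0.5); the m-step matrix follows the simplified Chapman-Kolmogorov Formula 5-17:

```python
import numpy as np

p = 0.5  # assumed fair coin
P = np.array([
    [1,     0,     0,     0,     0,     0,     0],
    [1 - p, 0,     p,     0,     0,     0,     0],
    [0,     1 - p, 0,     p,     0,     0,     0],
    [0,     0,     1 - p, 0,     p,     0,     0],
    [0,     0,     0,     1 - p, 0,     p,     0],
    [0,     0,     0,     0,     1 - p, 0,     p],
    [0,     0,     0,     0,     0,     0,     1],
])

# P^(m) = P^m: m-step transition probabilities (Formula 5-17).
P_m = np.linalg.matrix_power(P, 100)

# Starting with four coins (state 4), the probabilities of ending in the
# absorbing states 0 and 6 converge to roughly 1/3 and 2/3 respectively.
print(P_m[4, 0], P_m[4, 6])
```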

This determination can be done by calculating $P^{(m)}$ with $m = 1, 2, \ldots, \infty$. Markov chains often converge to a stationary distribution, so that the probability of an absorbing state can be given for an infinite number of state changes. Formula 5-18 introduces a counter-example that does not converge. Hence, the probability that an absorbing state is taken during the whole processing time of the Markov chain cannot be obtained in every case, but it can in many cases. ([Waldmann04], p. 40)

$$P = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$$

Formula 5-18: Exchange Matrix as an Example of a Non-Converging Markov Chain

The Markov property can also be transferred to the setting of time-continuous stochastic processes. The result is called a Markov process. The biggest difference to Markov chains is the non-applicability of the above described state probability calculations: the results can no longer be determined by just multiplying matrices but only by solving differential equations. To ease calculations, the underlying process is often assumed to be asymptotic and only the stationary state is used for calculations. This leads to a linear system of equations that can be solved with less effort again.56

55 See ([Beichelt97], p. 148) for details

Both Markov chains and Markov processes have become important analyzing methods within many different settings. As mentioned above, they can be used, for instance, to predict the time of first occurrence of a critical system state. Looking back at the setting of machinery condition monitoring from section 4.2, Markov chains could be used, for example, to predict upcoming malfunctions due to friction. ([Waldmann04], p. 6-7)

The Markov property seems promising within the setting of sensor based temperature monitoring as well, because a cooling device may malfunction at any time, no matter how long it worked fine before. But as already mentioned in section 2.2.5, a real technical malfunction has a very low, unknown probability. Hence, the starting distribution and the transition matrix cannot be determined.

5.8 Inferential Statistics

In contrast to descriptive statistics, inferential approaches do not describe available datasets but try to generalize knowledge gained from existing data. These methods are applied to problems where data cannot be obtained entirely. The general idea is to analyze a representative sample of the statistical universe. But only in case of a really representative sample can the gained results be applied correctly to the whole statistical universe. ([Eckey02], p. 242)

The generalization of gained information is always bound to probability calculation. Hence, the general approach of inferential statistics is to determine the distribution of a representative sample. Afterwards, this distribution can be used to perform interval estimations, hypothesis testing or other similar methods.57

An application to sensor based temperature monitoring would require such a representative sample to determine the distribution. But in fact, such a representative sample does not exist because of the randomness of external influences. This problem could partly be solved by using monitoring data of a longer time period as a representative sample to calculate the distribution.

56 See ([Waldmann04], chapter 4) for details
57 See (e.g. [Scharnbacher04]) for details



But the greatest problem is again the probability calculation, because a calculated low probability of a short-term malfunction could again lead to the assumption that the corresponding cooling device will not break down.58

5.9 Data Mining

Data mining represents a special way of statistical data analysis. Its main purpose is to determine relationships between several items that were not recognized previously. Most often, these relationships were not of primary interest at collection time. ([Gentle02], p. 123)

To be able to apply data mining, many companies nowadays collect as much data as possible. In former times, check-outs in supermarkets, for instance, just summed up unit prices to calculate the final amount. Modern check-outs log every single product as well as other available data. Moreover, credit cards or discount cards allow a customer's identification. ([Martin98], p. 249-250)

These collected datasets can be analyzed by the use of data mining methods to gain additional knowledge. A goal could be, for example, adapted sales promotion for different kinds of customers. Furthermore, the determination of the customers' buying behavior could be of interest. A possible result could be that eighty percent of customers that buy beer also buy potato chips. ([Lusti02], p. 262)

5.9.1 General Fields of Application

The just quoted examples already introduced some very common approaches. In general, data mining is divided into five fields of application: ([Lusti02], p. 262)

1. Text mining
2. Association rule mining
3. Prediction
4. Clustering
5. Classification

Text mining is the most basic field of application. Its purpose is to find patterns in text files for information retrieval. Therefore, special search algorithms have to be implemented. These implementations are characterized by the type of text input and the expected output.59

58 See also section 5.6

A very popular example of text mining is the automated collection of e-mail and postal addresses from internet pages. The text mining algorithm has to identify these addresses as well as links to other pages to be able to continue searching. But as this data mining approach can only be applied to textual data, this diploma thesis will not go into further detail.

Association rule mining is a multi criteria approach. Its purpose is the explorative discovery of dependencies between several items. Association rule mining is based on statistical correlation analysis. But as it is a multi criteria approach, it needs at least two different measures as input.60

The above mentioned example of beer and potato chips is a typical assignment, but an application to temperature monitoring data does not seem to be promising, especially because the only possible information gain is a correlation between door openings and temperature behavior, which is generally known already. Hence, this data mining approach will also be left out within this diploma thesis.

Prediction methods like regression and time series analysis have already been introduced. The setting of data mining offers an additional approach, the so called artificial neural networks. These networks will be introduced and reviewed in the succeeding subsections.

Clustering is an approach that scans large datasets and tries to identify different kinds of groups which are previously unknown. Clustering is often used as a first step in order to apply other data mining methods to the identified groups ([Martin98], p. 269). The example of adapted sales promotion could be achieved by the use of clustering. Therefore, groups that divide the customers' behavior best (e.g. a separation by special interest or buying behavior) are determined automatically ([Lusti02], p. 261).

Classification is similar to clustering. The main difference is the already existing knowledge of the classes (e.g. "creditworthy" vs. "not creditworthy"). An easy but basic approach is the so called rule induction. New rules are created either by experts or by an analysis of historical data ([Gentle02], p. 237-238). Automated clustering as well as an analysis of historical data is often done by the use of artificial neural networks ([Blasig95], p. 3-4).

59 See (e.g. [Multhaupt00], chapter 3-4) for details
60 See (e.g. [Wittenberg98], p. 161-165) for details

The succeeding subsections will introduce artificial neural networks and review their applicability to sensor based temperature monitoring data. A positive review would allow the usage of automated clustering and classification.

5.9.2 Artificial Neural Networks

Artificial neural networks are based on the functioning of the human brain. Every brain consists of neurons. These neurons are stimulated by neighboring neurons through chemical impulses, the so called neurotransmitters. Neurons transform incoming chemical impulses into electrochemical signals and relay them to the neighboring neurons. A regular exchange of these signals between two neurons leads to a high activation of this connection. By contrast, sparse communication leads to a low activation or even a loss of the connection. ([Martin98], p. 262-263)

The underlying basic principle is learning from failures. A connection that represents an error is assigned a low activation level after recognition. By contrast, generally valid facts are represented by a highly activated connection. ([Lusti02], p. 316)

Artificial neural networks adopt this functioning. They are defined by a tuple (N, V, F). N is a set of neurons. A neuron n_i is defined by Formula 5-19. V and F represent a set of directed connections between neurons and a set of learning functions respectively, which are defined by Formula 5-20. ([Hagen97], p. 6-7)

$$
n_i = \big(x(t),\ w_i(t),\ a_i(t),\ f,\ g,\ h\big)
$$

with

$$
\begin{aligned}
x(t) &= (x_1(t), \dots, x_n(t)) \in \mathbb{R}^n && \text{as input vector at time } t\\
w_i(t) &= (w_{i1}(t), \dots, w_{in}(t)) \in \mathbb{R}^n && \text{as weighting vector at time } t\\
a_i(t) &\in \mathbb{R} && \text{as activation level at time } t\\
h &: \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R} \text{ with } s_i(t) = h(x(t), w_i(t)) && \text{as propagation function to provide the input signal } s_i(t)\\
g &: \mathbb{R} \times \mathbb{R} \to \mathbb{R} \text{ with } a_i(t) = g(s_i(t), a_i(t-1)) && \text{as activation function to calculate the activation level } a_i(t)\\
f &: \mathbb{R} \to \mathbb{R} \text{ with } y_i(t) = f(a_i(t)) && \text{as output function to calculate the output } y_i(t)
\end{aligned}
$$

Formula 5-19: Definition of Neurons


$$
\begin{aligned}
& V \subseteq N \times N \text{ is a set of directed connections } (n_i, n_j)\\
& F = \{\, F_i : n_i \in N \,\} \text{ is a set of learning functions, which calculate new weightings for the neurons:}\\
& \qquad w_i(t_2) = F_i\big(W(t_1),\ y(t_1),\ a(t_1),\ d\big)
\end{aligned}
$$

with

$$
\begin{aligned}
W &= \text{weighting matrix}\\
y &= \text{output vector}\\
a &= \text{activation vector}\\
d &= \text{aimed output vector (not necessary in case of a self-organized network, see below)}
\end{aligned}
$$

Formula 5-20: Definition of V and F

Figure 5-6 pictures the above given definition of an artificial neuron. Due to this functioning, an artificial neural network is similar to a Petri net but dynamic, because input and output can vary over time.

Figure 5-6: Functioning of an Artificial Neuron ([Hagen97], p. 8) (adapted)
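To make this definition more tangible, the following Matlab fragment evaluates a single artificial neuron for one point in time. It is only a minimal sketch: the choice of a weighted sum as propagation function h, a simple accumulation as activation function g and the logistic function as output function f, as well as all numerical values, are illustrative assumptions and not taken from [Hagen97].

    x      = [0.2; 0.7; 0.1];      % input vector x(t)
    w      = [0.5; -0.3; 0.8];     % weighting vector w_i(t)
    a_prev = 0;                    % activation level a_i(t-1)
    s = w' * x;                    % propagation: s_i(t) = h(x(t), w_i(t))
    a = a_prev + s;                % activation:  a_i(t) = g(s_i(t), a_i(t-1))
    y = 1 / (1 + exp(-a));         % output:      y_i(t) = f(a_i(t))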

Just like a human brain, artificial neural networks have to learn. Within this initialization part, training data is applied to an untrained network to determine the weightings. These weightings will remain unchanged once the initialization part is completed. In most cases, a representative part of the whole available data is taken as training data. ([Lusti02], p. 320-322)

In general, two approaches of learning exist:

• Supervised learning
• Unsupervised learning


The general idea of supervised learning is a feedback of already known results. This means that during initialization not only inputs are provided but also aimed results. Hence, the neural network is able to adapt its weightings to these aimed results. Offering these results can be done in two ways. First of all, historical data could be used that already contains results (e.g. a forecast done by the network can be evaluated by comparing it to the actually occurred value). The other possibility of supervised learning is the usage of a trainer. This trainer evaluates the results of training inputs and rates them. These ratings signal to the network how the weightings have to be changed. ([Heuer97], p. 16-17)

Hence, supervised learning is done by reacting to errors. A common learning approach is the usage of the delta rule. As described above, the neural network determines an output vector y for a given input vector x. Moreover, a vector d must be given, which contains the aimed results. To be able to apply the delta rule, the magnitude of error has to be calculated by using the following Formula 5-21: ([Hagen97], p. 22-23)

$$
\delta_i = d_i - y_i \qquad i = 1, \dots, n
$$

with

$$
\begin{aligned}
\delta_i &= \text{error}\\
d_i &= \text{aimed result}\\
y_i &= \text{calculated output}
\end{aligned}
$$

Formula 5-21: Determination of Error

As described above, this error is used to adapt the weightings between the single neurons. Formula 5-22 contains the often used delta rule, which shall exemplify supervised learning.


$$
\begin{aligned}
w_{ij}(t+1) &= w_{ij}(t) + \alpha \cdot \big(d_i(t) - y_i(t)\big) \cdot x_j(t)\\
            &= w_{ij}(t) + \alpha \cdot \delta_i(t) \cdot x_j(t)
\end{aligned}
\qquad i = 1, \dots, n
$$

with

$$
\begin{aligned}
w_{ij} &= \text{weighting of the connection from } n_j \text{ to } n_i\\
\alpha &> 0 = \text{learning rate}\\
\delta_i &= \text{error}\\
x_j &= \text{given input}\\
d_i &= \text{aimed result}\\
y_i &= \text{calculated output}
\end{aligned}
$$

Formula 5-22: The Delta Rule
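The following Matlab fragment sketches one application of the delta rule to the weightings of a single linear neuron. The learning rate, the input vector, the initial weightings and the aimed result are made-up illustrative values.

    alpha = 0.1;                  % learning rate
    x     = [0.2; 0.7; 0.1];      % given input
    w     = [0.5; -0.3; 0.8];     % weightings of the connections to n_i
    d     = 1;                    % aimed result for neuron n_i
    y     = w' * x;               % calculated output (linear neuron assumed)
    delta = d - y;                % error according to Formula 5-21
    w     = w + alpha * delta * x;    % adapted weightings (Formula 5-22)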

Unsupervised learning has to be used if only the question of data analysis but not the result is available. The general idea is again to train the network with sample patterns. But this time, the network has to find and evaluate structures itself. A necessary requirement is redundancy within the input vector. The more redundancy, the better the training results, because redundancy allows the identification of noise and disturbances. ([Heuer97], p. 18; [Hagen97], p. 19)

Most unsupervised approaches use Hebb learning. This principle is adopted from the human brain, where the weighting of the connection between two active neurons increases. Hebb postulated that the change in weighting is proportional to the product of the two neurons' outputs. Formula 5-23 summarizes this approach. ([Hagen97], p. 20)

$$
w_{ij}(t+1) = w_{ij}(t) + \alpha \cdot y_i(t) \cdot y_j(t)
$$

with

$$
\begin{aligned}
w_{ij} &= \text{weighting of the connection from } n_j \text{ to } n_i\\
\alpha &> 0 = \text{learning rate}\\
y &= \text{calculated output}
\end{aligned}
$$

Formula 5-23: Hebb Learning Rule
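Analogously, a single Hebb update of one connection could look as follows in Matlab; again, all values are illustrative.

    alpha = 0.1;                  % learning rate
    w_ij  = 0.2;                  % weighting of the connection from n_j to n_i
    y_i   = 0.8;                  % output of neuron n_i
    y_j   = 0.6;                  % output of neuron n_j
    w_ij  = w_ij + alpha * y_i * y_j;   % increased activation of the connection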

5.9.3 Non-Applicability of Artificial Neural Networks to Current Datasets

Although many different specific artificial neural networks exist, they are always based on either supervised or unsupervised learning methods. An application of an unsupervised artificial neural network to currently obtained temperature data is not possible, because each input vector would only contain a time, a temperature and sometimes the data of a door opening sensor. These three variables always contain a different kind of information, so that each input vector is free of redundancy. But as mentioned in the last section, redundancy is necessary to identify structures or patterns in case of unsupervised learning.

By contrast, supervised artificial neural networks are used within other settings to classify the condition of a monitored device. But certain preconditions always have to be met. A neural network is able to predict upcoming failures of pumps, for instance. This is possible because a pump shows a nearly constant behavior. Typical slight changes in behavior over time that indicate an upcoming malfunction can be learned by an artificial neural network, because every device behaves in nearly the same way. (e.g. [Hawibowo97], chapter 5)

At the moment, a problem of application is that the provided datasets from the UMC St. Radboud only cover a time range of about a year. Moreover, not a single technical malfunction occurred during that year, 61 so that this data is insufficient to train an artificial neural network on the recognition of technical malfunctions.

But even if the datasets contained some errors, a general problem is again the very small quantity of real malfunctions. 62 This could lead to a learning behavior that ignores malfunctions because of very low weightings of the corresponding edges within the network.

61 See section 2.2.5 for details
62 See section 2.2.5 for details

5.10 Promising Analyzing Methods

As neither the generalized approach from section 4.3.2 nor other approaches are directly applicable to the setting of sensor based temperature monitoring, this section will combine the collected ideas into promising analyzing methods. Chapter 6 will apply these suggested approaches to data from the UMC St. Radboud and will review them according to the requirements analysis.

5.10.1 Promising Application of Basic Descriptive Statistics

Section 5.3 introduced the most common descriptive statistical measures. As the basic ones are applicable to all kinds of stored numerical data, this section will identify the probable information gain of using descriptive statistics to better comply with the determined requirements. 63

The basic idea is to detect changes in general behavior by comparing basic measures from different succeeding time intervals. This time interval can generally be chosen freely. As this diploma thesis assumes data of at least several months, a time interval of one day per value leads to meaningful results. 64

The smaller the chosen time interval, the higher the influence of new measurement values. On the other hand, too small time intervals, like hours for instance, could lead to significantly deviating results even if the behavior of the monitored device did not change. This problem is mostly caused by the random behavior of employees. Even a chosen time interval of one day per value can lead to deviating results if the user behavior differs significantly. To exclude random user behavior, a division of the analyzing task into daytime and nighttime data analysis is promising.

The door sensor data can be used to define the daily daytime and nighttime intervals. Daytime starts with the first door opening and ends a defined time range after the last door opening. This allows, for instance, a comparison of nighttime or daytime mean values of different days. As the nighttime values are not influenced by random user behavior, they should be very similar. Variances in daytime values could indicate deviating employee behavior.

Based on this idea, the aimed goal to gain additional knowledge can be added to XiltriX (or, of course, to any other monitoring system) by implementing an automated notification service. This service should calculate and compare daily daytime and nighttime values of minimum, maximum, mean and standard deviation. Median and mode values are not promising due to their ignorance of outliers. 65

In case of a significant change of one of these values from one day to another, the person in charge should be notified. A change should be classified as significant as soon as the daily change is higher than delta times the regular changing behavior. To be able to define this behavior, historical data is needed. In general, data of a few days may suffice, but data of a longer time span probably makes the gained results more reliable.

63 See section 2.5 for details
64 See section 6.2.1 for details
65 See section 5.3 for details
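As an illustration of the suggested notification service, the following Matlab sketch compares the nightly mean values of succeeding days. It assumes that the interpolated data is available as a datenum vector t and a temperature vector temp, and it simplifies the nighttime definition to a fixed interval from 22:00 to 6:00 instead of deriving it from the door sensor data as described above.

    delta  = 5;                                   % notification factor
    hrs    = (t - floor(t)) * 24;                 % hour of the day for every value
    night  = (hrs >= 22 | hrs < 6);               % simplified nighttime definition
    dayIdx = floor(t);                            % calendar day of every value
    ud     = unique(dayIdx(night));
    m      = zeros(size(ud));
    for k = 1:length(ud)
        m(k) = mean(temp(night & dayIdx == ud(k)));   % nightly mean value
    end
    ch     = abs(diff(m));                        % changes from one day to the next
    notify = find(ch > delta * mean(ch)) + 1;     % significantly deviating days
    datestr(ud(notify))                           % report these days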


Besides the introduced notification service, a comparison of the door openings' quantity to other devices and a graphical distribution of the stored temperature sequence should also be added. The analysis of door openings could be used to optimize the usage of cooling devices. If, for example, two freezers of the same type exist, they should on average have about the same quantity of door openings. Otherwise, some often accessed contents should be moved to the device with fewer door openings to improve the cooling behavior of both freezers.

The graphical distribution could be used to get brief information on the cooling device's accuracy. A very narrow distribution with a high peak indicates a very accurate behavior with little deviation. By contrast, a broad distribution or several smaller peaks may indicate a very inaccurate behavior. 66 This suggested distribution should offer highly aggregated information to evaluate the general behavior of the corresponding cooling device at a glance.

Section 5.10.4 will review the probable improvements of applying basic statistics and the consecutively introduced ideas to the setting of sensor based temperature monitoring. The case study in chapter 6 applies these ideas to a sample dataset to validate the estimated improvements.

5.10.2 Detection of Changes in Behavior by the Use of Regression

The last section introduced a promising possibility to detect changes in behavior on a daily basis by the use of basic statistical measures. Also promising is the use of regression. Section 5.4.2 pointed out that regression can be used to describe a temperature sequence by determining a regression function. This possibility leads to two general ideas:

1. The comparison of single cooling cycles to each other
2. The determination of a trend in general behavior in the long run

To compare single cooling cycles to each other, a polynomial regression function could be determined for a single representative cooling cycle. Afterwards, the coefficient of determination could be used to calculate the fit of other cycles to this regression function. 67 In case of a significant change in fit, a general change in behavior could be discovered.

66 See section 6.2.1 for details
67 See section 5.4.2 for details


This idea would allow the recognition of changes within a very short time. Nevertheless, it is not promising, because the single cooling cycles differ from each other due to technical and other reasons. 68 Hence, an application of this method would lead to a high quantity of additional false alarms.

The second idea is based on the assumption that temperature sequences of cooling devices should not contain a trend in the long run. To obtain a presumable trend, a linear regression function could be used. In case of a good fit (coefficient of determination ≥ 0.9), 69 the gradient of the regression function determines the trend. As already mentioned, this method is only promising in the long run, because linear regression functions in the short run are highly influenced by outliers. 70 This would lead to a too small coefficient of determination.

5.10.3 Classification by Using Past Behavior

As a real malfunction is very improbable and the current number of alarms leads to a loss of credibility, more system states have to be established than just "OK" and "Malfunctioning". 71 Moreover, most alarms are user made due to door openings [Nijmegen06]. An application of data mining methods may provide these additional system states. The main idea is to compare the current behavior to similar situations in the past and their succeeding development.

Therefore, a classification of alarms into different levels (e.g. green, yellow, red) could be used to indicate how critical a current temperature exceeding is. To achieve such a classification, every alarming situation is compared to all other situations in the past. The general assumption is that an alarm is only classified as a red one if it is significantly different from most previous ones. Furthermore, alarms that cannot be connected to a previous door opening should immediately be classified as a red alarm. To be able to classify the door made alarms, different criteria have to be found.

A promising suggestion is the usage of the following criteria:

• The duration of door openings
• The maximum temperature during an alarm
• The maximum duration of an alarm

68 See section 2.3 for details
69 See section 5.4.2 for details
70 See Figure 5-2 for details
71 See section 4.3.2 for details


A classification could be achieved by calculating the probability of the actual situation, based on historical data. The underlying assumption is that a situation that is more exceptional than any other in the past indicates a critical situation with a probability of one hundred percent. If, for instance, a current door opening already takes more than one minute, a comparison to past values could conclude that ninety percent of all door openings took less time. Hence, the probability that the current situation is critical is ninety percent. To put this value on a broader basis, the maximum probability over all three criteria should be taken.

The probability of a critical situation could be used to define the above suggested alarm levels:

• Green alarm: probability of a critical situation ≥ 50%
• Yellow alarm: probability of a critical situation ≥ 75%
• Red alarm: probability of a critical situation ≥ 95%

This definition is only used exemplarily and can be adapted to other values. Especially the assumption that probabilities < 50% shall not be classified as alarms, although the critical temperature level is exceeded, might be problematic in some settings. But this assumption saves a lot of alarms without increasing the risk too much. 72

72 See section 6.2.3 for details
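A minimal Matlab sketch of this classification for the criterion of door opening duration could look as follows; the historical durations and the current duration are made-up example values, and the same scheme would be applied to the two other criteria, taking the maximum of the three probabilities.

    histDur = [5 8 12 19 25 38 50 84 120 300];     % illustrative historical durations (s)
    curDur  = 90;                                  % duration of the current door opening
    p = sum(histDur <= curDur) / length(histDur);  % empirical probability of a critical situation
    if p >= 0.95
        level = 'red';
    elseif p >= 0.75
        level = 'yellow';
    elseif p >= 0.50
        level = 'green';
    else
        level = 'none';                            % not reported as an alarm
    end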

Using classification like this offers the person in charge additional operational decision support on whether an occurring alarm has to be taken seriously or not. Section 6.2.3 will apply this method to a sample dataset to point out the possible improvements.

5.10.4 Review

The last three sections introduced promising ideas to improve the current monitoring situation. This section will now review the expected improvements. Whether these methods really lead to the expected gain of information will be reviewed in chapter 6 by applying them to a sample dataset.

Section 5.10.1 pointed out that the application of descriptive statistics to monitoring data might be used to recognize significant changes of the general cooling behavior on a daily basis. Especially the analysis of daily nighttime values could recognize changes that are not caused by user interaction. This improvement would extend the currently limited possibility to recognize general changes in the short run. 73

In addition to that, the suggested analysis of door openings might offer the possibility to optimize the usage of cooling devices. Although this was not part of the requirements analysis, a relief of frequently opened devices could lead to a better cooling behavior and fewer alarms. The suggested graphical distribution also cannot improve factors of the requirements analysis. But it could indicate the cooling device's accuracy and behavior at a glance. This additional knowledge might support decisions in case of uncertainty about the cooling device's general condition.

Section 5.10.2 presented a promising way to discover a trend by the use of regression. Such a trend would signal a change in general behavior in the long run. Hence, a combination of basic statistics and regression could lead to a limited ability to predict upcoming failures. In fact, a definite prediction is not possible due to the very low probability and the lack of information. 74 But a detected change may hint a person in charge to have a closer look at the corresponding cooling device.

These estimated improvements can even be extended by using not only statistical analysis but also data mining. The suggested classification from section 5.10.3 establishes additional system states besides "OK" and "Malfunctioning". These states could allow a more accurate description of the current system state because an alarm is rated. Based on these ratings, a person in charge could react to a temperature exceeding in a better way. Moreover, external influences are partly recognized because the classification of alarms also depends on the occurrence of door openings in advance of a temperature exceeding.

Hence, a combination of statistical analysis and data mining is promising to significantly improve the current monitoring situation. The estimated improvements are summarized again in Table 5-1. Blue dashed arrows represent estimated improvements by the use of statistical analysis and magenta dashed arrows represent estimated improvements by the use of data mining.

73 See section 6.3 for details
74 See sections 2.4.1, 5.4.2 and 5.9.3 for details


Table 5-1: Estimated Improvements

• Approach is able to classify the current state of a monitored device
• Approach is able to recognize significant changes of behavior in the short run
• Approach is able to recognize significant changes of behavior in the long run
• Approach is able to predict upcoming failures
• Approach is able to identify failures as soon as they are recognizable
• Approach is able to avoid an error of the second kind in any case
• Approach is able to recognize external influences
• Approach is able to optimize the usage of cooling devices
• Approach offers a very quick overview of the cooling device's accuracy

The succeeding chapter 6 will present an application of the introduced methods to a selected sample dataset from the UMC St. Radboud to evaluate the real improvements in practice.


6 Implementation and Case Study

This chapter will apply the promising analyzing methods from section 5.10 to a selected dataset from the UMC St. Radboud. Therefore, section 6.1 will introduce the major problems and the actually found solutions for performing the data analysis task on the exported XiltriX data. Afterwards, the calculated results will be presented. A review of the information gain according to the just defined estimated improvements will conclude this chapter.

6.1 Implementation of Promising Analyzing Methods

Section 3.2.1 already introduced the stored data of XiltriX. The export functionality allows a storage of this data to disk as a comma separated value file (CSV). An excerpt of such a file is pictured in Figure 6-1.

Figure 6-1: Exported XiltriX Data (An Excerpt)

CSV is a standard file format and should be readable by a large number of programs. In fact, the import of this data into other programs is problematic due to the following two reasons:

• Programs failed to import the string based date and time correctly
• Programs failed to manage the occurring large datasets

Table 6-1 contains a listing of the tested programs and the existing problems:

Table 6-1: Import Problems of Tested Software Products

The tested products were Origin 7.5, Euler 2.4, Rt-Plot 2.7, FreeMat 2.0 and MS Excel 2002. For each product, the table indicates whether it is able to manage the occurring large datasets and whether it is able to import date and time correctly.


As all these software products fail to offer a satisfying solution, Matlab is used in the following to implement the suggested methods. In fact, Matlab also has some problems importing the original datasets, but these problems can easily be solved by changing some delimiters in the CSV file. 75 Moreover, Matlab is capable of importing date and time correctly and is able to process very large datasets.

In the following, the general ideas of the implementation will be introduced. The technical realization and annotations on problems that occurred, due to Matlab's limited programming possibilities, can be found in the appendix.

Caused by the storage behavior of XiltriX, 76 the stored values contain different time intervals. An example of this behavior is pictured in Figure 6-1. But the suggested analyzing methods from section 5.10 assume constant time ranges. Hence, the first step of data analysis is an interpolation of the stored datasets.

The basic idea is to create new datasets of measurement values that contain regular time intervals. Door openings are stored at the beginning of the minute in which they occur. A combination of the original and the interpolated datasets can be used to calculate the desired values, as described in the following, without making adaptations to the described methods. Certainly, in case of an implementation in XiltriX, its storage behavior should be adapted, so that interpolation is not necessary any more.

75 See appendix for details
76 See section 3.2.1 for details
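A minimal Matlab sketch of this interpolation step could look as follows, assuming the exported values are already available as a sorted datenum vector t and a temperature vector temp; the one-minute grid is an assumption and not prescribed by XiltriX.

    step    = 1 / (24 * 60);                        % one minute in datenum units
    tReg    = (t(1):step:t(end))';                  % regular one-minute time grid
    tempReg = interp1(t, temp, tReg, 'linear');     % linearly interpolated temperatures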

After this interpolation, the desired statistical measures can be calculated. Therefore, the original and the interpolated datasets are divided into single days, and these days again into daytime and nighttime. As described in section 5.10.1, the daytime limits can be obtained by analyzing the door openings. Based on these classes, the promising measures maximum, minimum, mean, standard deviation and the number of door openings can be calculated on the aimed basis of daytime, nighttime and whole day.

To obtain correct results, the calculation of minimum and maximum values has to be based on the original data to avoid smoothing. By contrast, the mean as well as the standard deviation has to be based on the interpolated data to achieve a correct weighting in time. The determination of door openings can be based on both datasets, because the number of door openings remains unchanged after interpolation. The additional goal, to plot a temperature distribution, can be implemented easily by just counting the number of occurrences of the single temperature values within the interpolated dataset. To allow advanced data analysis of all these calculated results, an export to Microsoft Excel files is done. Moreover, graphs are plotted and exported as TIF graphic files. 77

Besides this statistical calculation, section 5.10.2 introduced the promising approach of using linear regression. The functionality to calculate common kinds of regression functions is built into Matlab. The needed coefficient of determination can also be obtained. 78 Hence, a self-made implementation for this kind of statistical analysis is not necessary.

By contrast, the suggested data mining methods from section 5.10.3 have to be implemented anew. The first step of the aimed classification is the identification of alarms. This is done by scanning the interpolated dataset for a temperature exceeding. As soon as such an exceeding is found, the next uncritical value is looked up. The time interval between these two values is classified as an alarm. To determine just the alarms that were caused by a door opening, only those intervals are classified as an alarm that have a door opening within a predefined offset time. After the identification of the single alarm intervals, the maximum alarm temperatures and the alarm durations can be collected to calculate the corresponding probability values.
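The scan for alarm intervals could be sketched in Matlab as follows, assuming the interpolated vectors tReg and tempReg from above and, purely as an example, the critical limit of 6°C that was set for the sample dataset before March 2006.

    limit  = 6;                                    % critical temperature limit (assumption)
    over   = (tempReg > limit);                    % values exceeding the limit
    d      = diff([0; over(:); 0]);
    aStart = find(d == 1);                         % first exceeding value of each alarm
    aEnd   = find(d == -1) - 1;                    % last exceeding value of each alarm
    durMin = (tReg(aEnd) - tReg(aStart)) * 24 * 60;    % alarm durations in minutes
    peak   = zeros(size(aStart));
    for k = 1:length(aStart)
        peak(k) = max(tempReg(aStart(k):aEnd(k)));     % maximum temperature per alarm
    end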

The calculation of the door openings' durations is more complex, because the exact duration can only be obtained from the non-interpolated datasets. Moreover, only door openings that lead to an alarm should be recognized. Hence, only the durations of door openings within the offset time of alarms should be calculated and collected. To achieve that, the door openings found within the offset time of alarms are looked up in the original data to determine their exact duration.

This collected information of maximum temperatures, alarm durations and durations of door openings is used afterwards to determine the limits for the single classification classes, as exemplified in Table 6-5. 79 As already mentioned, the technical realization and the technical problems that occurred during implementation can be found in the appendix. This chapter will continue with the application of these ideas to a selected sample dataset from the UMC St. Radboud.

77 See appendix for details
78 See section 5.4.2 for details
79 See section 6.2.2 for details


6.2 Case Study

The UMC St. Radboud provided 36 datasets. But none of them contains a real technical malfunction. Moreover, only ten of these datasets contain data of a connected door opening sensor. To be able to apply the suggested classification method, door sensor data is needed to determine whether a temperature exceeding was caused by a door opening or not. Hence, only one of these ten datasets can be chosen as the sample dataset.

Figure 6-2 pictures the actually selected dataset. It was selected because it contains several interesting factors. First of all, the set maximum temperature level was changed in March 2006 to reduce the quantity of false alarms (indicated by the red dashed line) [Nijmegen06]. Moreover, this temperature pattern contains eye-catching behavior. Beside some very high peaks, especially the global minimum, which occurred on September 22nd, is eye-catching, because this behavior is unique within the whole time span. In addition to that, a change of cooling behavior of about half a degree in the mean took place in the long run.

Figure 6-2: Temperature Overview of the Selected Sample Dataset

Aside from these interesting factors, all door openings took place between 6 o'clock in the morning and 10 o'clock in the evening. This will be declared as daytime within this example. Hence, the nighttime data of this chosen dataset is free from external user influences.

6.2.1 Detection of Changes in Behavior by Using Descriptive Statistics

Section 5.10.1 introduced a promising way of recognizing changes in behavior by just comparing basic descriptive statistical measures. This section will now present the calculated results for the selected dataset. As described in section 5.10.1, the suggested notification service compares succeeding daily daytime and nighttime values and reports irregular ones. This irregularity was defined as delta times the mean change. To obtain a better feeling for this delta, the calculated results are first of all presented in graphical form. Afterwards, two different deltas are chosen and their corresponding notifications are calculated. 80

The following Figures 6-3 to 6-10 contain the calculated results of the daily daytime and nighttime values of the promising basic measures. 81 To allow an easy comparison of daytime and nighttime values, they are plotted with the same scale. The results of the also suggested whole day analysis are not pictured, because they do not differ significantly from the daytime results.

The daytime maximum values are very irregular. In fact, they are almost comparable to the general temperature overview. But eye-catching is a very low maximum value at the end of November. The nighttime values offer a much clearer overview of the system's general behavior due to the absence of external user influences. But again, the graph contains the eye-catching low temperature. As this exceptional value is also recognizable at nighttime, it should be reported to a person in charge. A closer look at Figure 6-2 reveals a real change of cooling behavior within that time span of nearly two days, so that this notification should be made.

Beside this exceptional value, the maximum nighttime values are faced with another significant change at the beginning of January. Eye-catching is, furthermore, the behavior at the end of January. Within very few days, the daily nighttime maximums increased by about half a degree and nearly remained on that level for the rest of the monitoring time. This significant step also indicates a change in cooling behavior and should be notified.

80 See Table 6-2 for details
81 See section 5.10.1 for details


Figure 6-3: Maximum Values at Daytime

Figure 6-4: Maximum Values at Nighttime


The daytime and nighttime minimum values are very similar to each other. The reason for this similarity is based on the kind of external user influences. Most influences are caused by door openings or the insertion of warm samples, so that the minimum temperature remains uninfluenced. 82

The calculation of daily minimum values also reveals several remarkable changes. First of all, the daytime and nighttime values contain an exceptional change in minimum temperature of more than 1°C at the end of September. Regular changes from one day to another are normally at most 0.2°C, so that this change should be notified. In fact, this change in minimum temperature indicates the already introduced global minimum temperature that occurred on September 22nd.

In addition to that, some other remarkable changes exist. The first one is a rise in minimum temperature about one week before the global minimum occurs. Moreover, the already mentioned change at the end of November is eye-catching again. As the maximum as well as the minimum temperatures are remarkably different, a change of the general temperature level must have taken place. The last eye-catching factor is that the calculated minimum values contain a general trend.

The mean daytime and nighttime values appear to be very similar at first sight. But a closer look reveals some significantly higher peaks in the daytime data, which are not recognizable at nighttime. The reason for these peaks cannot be obtained for sure, but presumably they are again caused by external user influences like door openings.

But even the uninfluenced nighttime values are faced with higher variations than the already reviewed minimum and maximum temperatures. This complicates the identification of significant changes. But again, the changes from the end of September and November are recognizable. Moreover, the mean values also contain the mentioned trend.

The last promising measure is the standard deviation. Again, the daytime values contain several high peaks that have to be traced back to door openings. By contrast, the graph of the standard deviation at nighttime indicates the changes from the end of September and November more clearly than any other introduced measure.

82 See section 2.4.2 for details


Figure 6-5: Minimum Values at Daytime

Figure 6-6: Minimum Values at Nighttime

Figure 6-7: Mean Values at Daytime

Figure 6-8: Mean Values at Nighttime

Figure 6-9: Standard Deviation at Daytime

Figure 6-10: Standard Deviation at Nighttime


The visual analysis of these graphs revealed that the selected dataset contains several changes in behavior. Most significant are the global minimum on September 22nd and the change in temperature level at the end of November. Eye-catching as well, but not that significant, are the mentioned changes right in the middle of September and the general rise in temperature in the year 2006. Looking at the graphs indicated, furthermore, that the notification of changes based on daytime data is not promising, due to the high number of external influences.

After this graphical overview, the numerical analysis will be evaluated by testing two different deltas. Meaningful results can be obtained by choosing five or ten as delta. The lower delta leads to earlier notifications; the higher delta only notifies higher deviations. Table 6-2 pictures the mean deviations in the nighttime data as well as the minimum deviations that lead to a notification using the corresponding delta.

Table 6-2: The Chosen Deltas

                     Mean Deviation   5 x Mean Deviation   10 x Mean Deviation
Maximum                   0.027             0.135                 0.27
Minimum                   0.043             0.215                 0.43
Mean                      0.035             0.175                 0.35
Standard Deviation        0.011             0.055                 0.11

Table 6-3 pictures the calculated results. A yellow marked cell indicates a notification due to a delta of five. If a delta of ten was predefined, only the red marked values would be notified.

Using a delta of ten would have notified the most eye-catching changes in September and November. A delta of five would lead to 45 notifications. In fact, most of them are caused by the standard deviation and are not bound to significant changes in general behavior. Hence, the standard deviation should only be used with a high delta or left out. The residual measures can also be used with a delta of five. The notifications made in July, January and February can actually be traced back to small changes in cooling behavior, so that such a notification is right.

Hence, comparing nighttime measures from succeeding time intervals enables the recognition of changes in the short run (on a daily basis). The notification level can be adapted by choosing a higher or a smaller delta. Presumably, different deltas for different kinds of machines have to be chosen to find the right balance between too many and too few notifications.


Table 6-3: Reported Notifications (Based on Nighttime Data)

Date          Maximum   Minimum   Mean   Standard Deviation
15.06.2005      0         0        0.1        0.1
18.07.2005      0         0.3      0          0
19.07.2005      0         0        0.2        0.1
21.07.2005      0         0        0.1        0.1
23.07.2005      0         0        0.1        0.1
24.07.2005      0.1       0        0.1        0.1
03.08.2005      0.1       0        0          0.1
04.08.2005      0         0.1      0          0.1
10.08.2005      0         0        0.1        0.1
19.08.2005      0.1       0        0          0.1
20.08.2005      0.1       0        0          0.1
25.08.2005      0         0        0          0.1
26.08.2005      0         0        0          0.1
30.08.2005      0         0.3      0.1        0.1
20.09.2005      0         1.3      0.2        0.3
21.09.2005      0         0        0.4        0
22.09.2005      0.1       1        0.4        0.3
26.09.2005      0         0.1      0          0.1
27.09.2005      0         0.1      0          0.1
24.10.2005      0         0.2      0.2        0
28.11.2005      0         0.3      0.1        0.1
29.11.2005      0.3       0        0.2        0.1
30.11.2005      0.4       0        0.1        0.1
01.12.2005      0.1       0.4      0.3        0.1
05.01.2006      0.3       0        0.2        0
06.01.2006      0.2       0        0.1        0.1
10.01.2006      0         0        0.2        0
11.01.2006      0         0        0          0.1
12.01.2006      0         0        0.1        0.1
24.01.2006      0.2       0.2      0.2        0
26.01.2006      0         0        0          0.1
27.01.2006      0         0        0          0.1
01.02.2006      0         0        0          0.1
07.02.2006      0.1       0.2      0.1        0.1
23.02.2006      0.2       0        0.1        0.1
24.02.2006      0.1       0.1      0.2        0
03.03.2006      0         0        0.1        0.1
08.03.2006      0         0.2      0.1        0.1
11.03.2006      0.1       0        0.1        0.1
12.03.2006      0.1       0.1      0.1        0.1
15.03.2006      0.2       0        0.1        0
01.05.2006      0.2       0        0.1        0
11.05.2006      0         0.2      0.1        0.1
17.05.2006      0.1       0.2      0.1        0.1
23.05.2006      0.2       0        0.1        0


Aside from the determination of changes in the short run, section 5.10.1 suggested to offer visualization possibilities for the occurred door openings. Furthermore, a graphical temperature distribution was suggested to obtain the accuracy of a cooling device. Figure 6-11 pictures these additional ideas. The overview of door openings allows an easy comparison of the usage to other devices. Moreover, the pictured distribution allows a very fast overview of the device's accuracy in the long run: the sharper the peak, the higher the accuracy. Remarkable in this example is the second peak, which indicates the significant change in behavior.

Figure 6-11: Daily Door Openings and Temperature Distribution of the Selected Dataset

This section proved that even the simple application of basic statistical measures can reveal changes in general behavior. Up to the calculation of these results, the corresponding cooling device was classified as well running. No one had recognized these changes.
these changes.<br />

6.2.2 Detection of Changes in Behavior by the Use of Regression

Section 5.10.2 introduced the promising idea to detect changes in general behavior in the long run by the use of regression. An application of linear regression to the selected dataset leads to the regression function in Formula 6-1. Remarkable is the high coefficient of determination, which indicates a very good approximation. 83 Figure 6-12 offers a graphical representation.

$$
\hat{y} = 0.0019307\,x - 1409.6 \qquad R^2 = 0.97492
$$

Formula 6-1: Regression Function and Coefficient of Determination

83 See section 5.4.2 for details


Important for the determination of the trend is only the gradient of the determined function. The residual component of the regression function is evoked by Matlab's internal representation of the date and can be ignored. This gradient has to be multiplied by the number of days. As the selected monitoring data covers a time span of 366 days, a trend of 0.0019307 · 366 ≈ 0.7°C is recognized in the long run. A closer look at Figure 6-8 confirms this trend.

Figure 6-12: Regression Function for the Selected Dataset
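The trend determination can be reproduced with Matlab's built-in functions; the following sketch assumes the monitoring data of the selected dataset is given as a datenum vector t and a temperature vector temp (a warning about a badly conditioned polynomial, caused by the large datenum values, can be ignored for this purpose).

    c     = polyfit(t, temp, 1);     % c(1) = gradient, c(2) = offset
    yhat  = polyval(c, t);           % fitted values of the regression line
    R2    = 1 - sum((temp - yhat).^2) / sum((temp - mean(temp)).^2);   % coefficient of determination
    trend = c(1) * 366;              % temperature change over the 366 monitored days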

6.2.3 Classification of Alarms by the Use of Historical Data

Section 5.10.3 introduced the promising idea to classify alarms in case of a temperature exceeding by the use of historical data. The first introduced step was the determination of whether an alarm can be traced back to a door opening. Therefore, an offset time has to be defined that specifies how long ago the last door opening may have occurred. Table 6-4 pictures different chosen offset times and the corresponding classification of alarms. 139 alarms occurred up to one minute after a door was opened. A defined offset time of three minutes would lead to only two alarms that would immediately be classified as red ones, and an offset time of ten minutes would lead to the result that all alarms were user made.


Table 6-4: Classification of Alarms

Selected Offset Time (Minutes)   Number of Alarms Caused by Door Openings
             1                                  139/158
             2                                  155/158
             3                                  156/158
             4                                  157/158
            10                                  158/158
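The attribution of alarms to door openings in Table 6-4 could be sketched in Matlab as follows, assuming alarmT holds the start times of the alarms and doorT the times of the door openings, both as datenum vectors.

    offset   = 3;                                  % chosen offset time in minutes
    userMade = false(size(alarmT));
    for k = 1:length(alarmT)
        gap = (alarmT(k) - doorT) * 24 * 60;       % minutes since each door opening
        userMade(k) = any(gap >= 0 & gap <= offset);
    end
    fprintf('%d/%d alarms caused by door openings\n', sum(userMade), length(alarmT));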

To classify these user made alarms, the suggested classes from section 5.10.3 are taken. Moreover, the suggested criteria are used to determine the current condition. Table 6-5 contains the corresponding results of the data analysis of the selected sample dataset. Due to the historical behavior, a green alarm will currently be raised in case of a door opening that takes at least 19 seconds, because 50 percent took less time. Moreover, a temperature of 6.9°C or higher and a temperature exceeding of 7 minutes or more would have the same effect. But as this data is calculated dynamically, these results only mirror a snapshot.

Table 6-5: Results of Classification According to Single Criteria

                       Duration of      Maximum Temperature   Maximum Duration
                       Door Openings    During an Alarm       of an Alarm
Green Alarm            ≥ 19 seconds     ≥ 6.9°C               ≥ 7 minutes
Yellow Alarm           ≥ 38 seconds     ≥ 7.8°C               ≥ 12 minutes
Red Alarm              ≥ 84 seconds     ≥ 9.8°C               ≥ 27 minutes
Occurred Red Alarms    39               8                     9      (42 in total)

Besides the current limits for the single alarm classes, the last row of Table 6-5 is of special interest. It contains the number of red alarms that would have been raised during the whole monitoring time. This quantity of 42 red alarms is significantly different from 158, so that more than 60% of all occurred alarms could be classified as not that critical. But if this classification method is applied to very critical devices, additional conditions are needed. If, for instance, the temperature level may not be exceeded for more than 15 minutes, the red alarm should go off earlier.


Remarkable is a comparison to the actually applied method of setting a higher maximum temperature limit, which was used to reduce the quantity of false alarms. 84 Data analysis revealed that this raised temperature limit was exceeded 152 times. In case of an unchanged temperature limit of 6°C, this number would have increased to 158. Hence, this method saved 6 alarms but increased the notification delay in case of a real malfunction.

84 See the introduction of chapter 6 for details

6.3 Review

Section 5.10.4 already pointed out the estimated improvements that might be achieved by using the suggested statistical and data mining methods. This section will review whether these estimated improvements really occurred.

First of all, descriptive statistics led to the estimation that the currently limited ability to detect changes in the short run may be improved. An application to the selected sample dataset showed that major changes in general cooling behavior were actually detected. Moreover, an adjustment to different security levels can be achieved by selecting bigger or smaller deltas. Only the application to daytime data does not provide reliable notifications, so that changes can only be recognized from morning to morning. But as this method recognized previously unknown irregularities, it definitely improves the recognition in the short run.

In addition to that, the application of regression provided very good results. The determined function had a very good fit (R² = 0.97492) and contained a gradient that described the actually occurred temperature increase well. Hence, as long as the monitoring data is not faced with too many influences that lead to a fit of less than 0.9, this method is able to reliably detect changes in behavior in the long run.
this method is able to reliably detect changes in behavior on the long-run.<br />

Section 5.10.4 pointed out that the combination of basic statistics and regression could lead to a limited ability to predict upcoming failures. In fact, both methods detected changes in behavior, but the cooling device kept on functioning. Hence, the gained results could be an indication of an upcoming malfunction, but do not have to be. Moreover, the optimization of the cooling device's usage by analyzing door openings cannot be verified within this diploma thesis but has to be tested in practice.

The application of data mining confirmed the estimated improvements. The gain of additional system states improved the previously limited possibility to classify the current state of a monitored device. As this classification also regards door openings, a limited possibility to recognize external influences is achieved.

Hence, a combination of statistical analysis and data mining is able to significantly improve the current monitoring situation. The achieved improvements are summarized again in Table 6-6. Blue arrows represent achieved improvements by the use of statistical analysis and magenta arrows represent achieved improvements by the use of data mining. Blue dashed arrows represent estimated improvements by the use of statistical analysis that cannot be confirmed for sure, due to the named reasons.

Table 6-6: Achieved Improvements

• Approach is able to classify the current state of a monitored device
• Approach is able to recognize significant changes of behavior in the short run
• Approach is able to recognize significant changes of behavior in the long run
• Approach is able to predict upcoming failures
• Approach is able to identify failures as soon as they are recognizable
• Approach is able to avoid an error of the second kind in any case
• Approach is able to recognize external influences
• Approach is able to optimize the usage of cooling devices
• Approach offers a very quick overview of the cooling device's accuracy

6.4 Recommendations

This diploma thesis pointed out the general problems of currently applied sensor based temperature monitoring. Most problematic was the very low probability of a real technical malfunction, compared to irregular temperatures that were caused by door openings or other external influences. Hence, the idea to just evaluate the current temperature of a cooling device leads to a large number of false alarms (e.g. 158 for the selected sample dataset within one year).


Interviews with several employees from the UMC St. Radboud revealed that currently no decision support exists that provides recommendations on what should be done in case of such an alarm. Not even the time and duration of the last door opening is displayed to offer at least a hint at possible user influence. The only thing an employee can do in case of an alarm is to inspect the corresponding cooling device manually by having a short look at it.

In fact, the very high quantity of false alarms led to a loss in credibility of XiltriX, so that employees tend to wait a certain time after an alarm went off. Only in case of an alarm enduring for a longer time period or the occurrence of an uncommonly high number of alarms during a short time interval is a manual inspection really made in most cases. [Nijmegen06]

As long as the stored contents are not damageable within very few minutes, this practice is doable. But the estimation of whether the development of a current alarm is like most others or not relies on the experience and instinct of the operational staff. The suggested data mining method to classify the development of an alarm into different alarm levels offers a higher reliability, because the decision of whether an alarm has to be classified as really critical is based on all available information, like door openings or past behavior, and not on unreliable user made estimations.

Hence, from my point of view this classification method should be added to XiltriX to offer additional decision support and to reduce the number of demanded inspections. Highly critical devices may either be excluded from this classification or be assigned other classification parameters and additional conditions.

Contell/IKS confirms the possible improvements but fears that this classification could lead to even higher user misbehavior, because classifications that are lower than the highest level might be ignored. Consequently, the current user behavior of waiting for a certain time interval might be applied to the highest classification, so that the user reaction is delayed to an unacceptable level.

The other major problem of sensor based temperature monitoring was the limited ability to recognize changes in the short and long run. Up to now, only changes are recognized that are bound to periodically occurring alarms. The suggested methods of using statistical analysis and regression to determine changes inside the normal temperature range achieved a major improvement of this situation. Only the determination of an appropriate delta for different kinds of cooling devices still needs to be done in practice.


This recognition of changes in behavior within the operating temperature interval is currently impossible with all introduced monitoring products, so this feature would add a unique selling proposition. Because of the accurate results of these methods and the argument of gaining a unique selling proposition, Contell/IKS is interested in these methods.

The additionally suggested idea to optimize the usage of cooling devices by comparing the quantity of door openings to each other could not be tested for possible improvements within this diploma thesis, due to missing testing possibilities. But this method was also presented to Contell/IKS. The person in charge confirmed the possibility of improvement. But the main focus of Contell/IKS will first of all lie in the implementation of the introduced statistical methods to enable XiltriX to detect changes in general behavior within normal operation.

Based on these facts, my recommendation is to implement the two statistical methods and the data mining method, because all three offer great results. Moreover, the optimization of the cooling device's usage should be tested for applicability. But due to the concerns about user misbehavior, Contell/IKS will not focus on the presented data mining method.


7 Summary

Cooling devices within medical laboratories often contain irrecoverable samples that are part of research work. As the loss of such a sample can cause damage of half a million euro, a warming up of the cooling device's contents has to be avoided in any case. Therefore, sensor based temperature monitoring systems have been developed that notify a person in charge as soon as (or even before) a fridge starts to malfunction.

Whether a cooling device is malfunctioning is currently determined solely by the definition of critical temperature values. This approach causes many false alarms due to door openings and other external influences. Moreover, the measurement data is stored mainly for documentation purposes.

The task of this research was to determine what additional knowledge can be gained from the stored datasets by using statistics and data mining. The aim was to derive additional knowledge about a cooling device's condition from the recorded datasets, in order to offer additional decision support in case of an exceptional temperature level. Moreover, a method to reliably predict upcoming malfunctions was sought.

The research started with an analysis of the regular and irregular behavior of cooling devices. A major result was that every cooling device has its own deviating temperature sequence, due to its technical functioning. Aside from that, the cooling behavior is disturbed by many environmental influences, mostly caused by user interaction. The last problem discovered is a lack of information that, in most cases, prevents finding the reason for a heating up.

The third chapter reviewed XiltriX and the other major available monitoring systems. Some of these systems are kept very simple, while others offer many additional features. A detailed analysis of all these products revealed, however, that every one of them is based on the insufficient idea of merely setting critical temperature limits.

The fourth chapter reviewed the current state of research. As no current research activity within this setting of sensor based temperature monitoring could be discovered, the main focus was put on the similar fields of machinery condition monitoring and measurement data analysis. Remarkable is a generalized data analysis approach that promised to predict future values of all kinds of measurement data without any knowledge of the underlying setting. Due to the high number of external influences, however, applying this approach failed.


Hence, chapter 5 reviewed the other promising approaches from chapter 4 for applicability. In particular, the promising application of time series analysis and artificial neural networks failed, mainly because of the very low probability of a real malfunction and the lack of training data containing malfunctions.

By contrast, three methods were identified that improve the current monitoring situation. The first one is based on the statistical measures minimum, maximum, mean and standard deviation. The basic idea is to detect changes in the general cooling behavior by comparing these measures across succeeding time intervals. As soon as a change is significantly higher than average, the user should be notified of this short-run change. To avoid too many false notifications, only the uninfluenced nighttime data is used, which can be determined with the help of a door opening sensor. A minimal sketch of this comparison follows.
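In the sketch, the daily nighttime means are replaced by example values, and the threshold delta is an assumption that, as noted above, still has to be determined per device type in practice:

meannighttime = [-20.1 -20.0 -20.2 -19.9 -18.6 -18.5]; %example daily values (°C)
delta = 0.5;                          %assumed per-device change threshold
changes = abs(diff(meannighttime));   %day-to-day change of the nighttime mean
suspect = find(changes > delta);      %days with an unusually large jump
for k = suspect
    fprintf('Possible change in cooling behavior between day %d and day %d\n', k, k+1);
end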

Moreover, linear regression can be used to determine a trend in the long run. Although the temperature data is not strictly linear, the achieved fit is sufficient to obtain reliable results.
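A minimal sketch of this long-run check with Matlab's polyfit; the daily mean values and the slope threshold are example assumptions, not values from the evaluated datasets:

dailymean = [-20.2 -20.1 -20.2 -20.0 -19.9 -19.8 -19.7]; %example daily means (°C)
days = 1:length(dailymean);
p = polyfit(days, dailymean, 1);   %linear fit; p(1) is the slope in °C per day
if abs(p(1)) > 0.05                %assumed slope threshold
    fprintf('Long-run trend detected: %.3f °C per day\n', p(1));
end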

The last identified method is based on data mining. The general idea is to compare the current behavior to similar situations in the past and their subsequent development, which enables a classification into different alarm levels.
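As a minimal sketch of this idea, the duration of a current alarm can be ranked against all recorded alarm durations (compare Appendix 3); the example durations and the 50%/90% level boundaries are assumptions, not values from this thesis:

pastdurations = [2 3 3 4 5 5 6 8 12 35]; %example alarm durations in minutes
currentduration = 9;
rank = 100 * sum(pastdurations <= currentduration) / length(pastdurations);
if rank <= 50
    level = 'normal';
elseif rank <= 90
    level = 'elevated';
else
    level = 'critical';
end
fprintf('Alarm duration ranks at %.0f%% -> level: %s\n', rank, level);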

Chapter 6 introduced the implementation of the identified methods in Matlab and applied them to selected sample datasets from the UMC St. Radboud (University Hospital of Nijmegen, the Netherlands). As a result, the combination of statistical analysis and data mining is able to significantly improve the current monitoring situation. Changes in behavior in the short run can be discovered by comparing daily statistical measures, and regression can be used to determine changes in cooling behavior in the long run.

Using the suggested classification provides the intended additional decision support in case of an exceeded temperature limit. Only the goal of reliably predicting upcoming failures cannot be achieved, because of unrecognizable external influences and the very low probability of a real technical malfunction. The recognition of changes in cooling behavior might, however, automatically hint at an upcoming malfunction, so that this goal is at least partly reached.


Bibliography

Books and Articles:

[Beichelt97] Frank Beichelt, Stochastische Prozesse für Ingenieure, B.G. Teubner, Stuttgart, 1. Edition, 1997
[Benker01] Hans Benker, Statistik mit Mathcad und Matlab, Springer Verlag, Berlin, 1. Edition, 2001
[Berthold99] Michael Berthold & David J. Hand, Intelligent Data Analysis, Springer Verlag, Berlin, 1. Edition, 1999
[Blasig95] Reinhard Blasig, Neuronale Netze und die Induktion symbolischer Klassifikationsregeln, Dissertation, Universität Kaiserslautern, 1995
[Bohnekamp97] H. Bonekamp, Monitor to Guard Fridge Temperature, In: Elektor Electronics, Canterbury: Elektro Publ. Ltd, ISSN 0308-308X, 23, p. 58-61, 1997
[Bourier03] Günther Bourier, Beschreibende Statistik, Gabler Verlag, Wiesbaden, 5. Edition, 2003
[Chatfield04] Chris Chatfield, The Analysis of Time Series, Chapman & Hall/CRC, Boca Raton (Florida), 6. Edition, 2004
[Daßler95] Frank Daßler, Tendenztriggerung – Meßdatenanalyse im on-line-Betrieb mit dem Ziel der frühzeitigen Erkennung und Vorhersage von Daten, Trends und Störungen, Dissertation, TU Chemnitz-Zwickau, 1995
[Eckey02] Hans-Friedrich Eckey & Reinhold Kosfeld & Christian Dreger, Statistik, Gabler Verlag, Wiesbaden, 3. Edition, 2002
[Gentle02] James E. Gentle, Elements of Computational Statistics, Springer Verlag, New York, 1. Edition, 2002
[Hagen97] Claudia Hagen, Neuronale Netze zur statistischen Datenanalyse, Dissertation, Technische Hochschule Darmstadt, 1997
[Hawibowo97] Singgih Hawibowo, Sicherheitstechnische Abschätzung des Betriebszustandes von Pumpen zur Schadensfrüherkennung, Dissertation, Technische Universität Berlin, 1997
[Heinzelmann99] Andreas Heinzelmann, Produktintegrierte Diagnose komplexer mobiler Systeme, VDI Verlag, Düsseldorf, VDI Reihe 12, Nr. 391, 1999
[Heuer97] Jürgen Heuer, Neuronale Netze in der Industrie, Gabler Verlag, Wiesbaden, 1. Edition, 1997


[Holland01] Heinrich Holland & Kurt Scharnbacher, Grundlagen der Statistik, Gabler Verlag, Wiesbaden, 5. Edition, 2001
[Jondral02] Friedrich Jondral & Anne Wiesler, Wahrscheinlichkeitsrechnung und stochastische Prozesse, B.G. Teubner, Stuttgart, 2. Edition, 2002
[Kolerus95] Josef Kolerus, Zustandsüberwachung von Maschinen, Expert Verlag, Renningen-Malmsheim, 2. Edition, 1995
[Krallmann05] Jens Krallmann, Einsatz eines Multisensors für ein Condition Monitoring von mobilen Arbeitsmaschinen, Dissertation, TU Braunschweig, 2005
[Krems94] Josef F. Krems, Wissensbasierte Urteilsbildung, Hans Huber Verlag, Göttingen, 1. Edition, 1994
[Lusti02] Markus Lusti, Data Warehousing und Data Mining, Springer Verlag, Berlin, 2. Edition, 2002
[Martin98] Wolfgang Martin, Data Warehousing – Data Mining – OLAP, Thomson Publishing International, Bonn, 1. Edition, 1998
[Masing88] Walter Masing, Handbuch der Qualitätssicherung, Carl Hanser Verlag, München, 2. Edition, 1988
[Multhaupt00] Marko Multhaupt, Data Mining und Text Mining im strategischen Controlling, Shaker Verlag, Aachen, 1. Edition, 2000
[Nauth05] Peter Nauth, Embedded Intelligent Systems, Oldenbourg Verlag, München, 1. Edition, 2005
[Pitter01] Frank Pitter, Verfügbarkeitssteigerung von Werkzeugmaschinen durch Einsatz mechatronischer Sensorlösungen, Meisenbach Verlag, Bamberg, 1. Edition, 2001
[Sick00] Bernhard Sick, Signalinterpretation mit Neuronalen Netzen unter Nutzung von modellbasierten Nebenwissen am Beispiel der Verschleißüberwachung von Werkzeugen in CNC-Drehmaschinen, VDI Verlag, Düsseldorf, VDI Reihe 10, Nr. 629, 2000
[Scharnbacher04] Heinrich Holland & Kurt Scharnbacher, Grundlagen statistischer Wahrscheinlichkeiten, Gabler Verlag, Wiesbaden, 1. Edition, 2004
[Turunen99] Esko Turunen, Mathematics behind Fuzzy Logic, Physica Verlag, Heidelberg, 1. Edition, 1999
[Waldmann04] Karl-Heinz Waldmann & Ulrike M. Stocker, Stochastische Modelle, Springer Verlag, Berlin, 1. Edition, 2004


[Wittenberg98] Reinhard Wittenberg, Grundlagen computerunterstützter Datenanalyse – Band 1, Lucius & Lucius, Stuttgart, 2. Edition, 1998

Interviewees:

[Nijmegen06] Several employees at UMC St. Radboud (University Hospital of Nijmegen, the Netherlands), Date: June 2nd, 2006
[Weerdesteyn06] Han Weerdesteyn, Product Manager of Contell/IKS

WebPages:

[2DI2006] Two Dimensional Instruments, LLC. (http://www.e2di.com/thermaviewer.html), Last visit: November 29th, 2006
[3M2006] 3M Worldwide (http://solutions.3m.com/wps/portal/3M/en_US/Microbiology/FoodSafety/products/time-temperature-indicators/), Last visit: November 28th, 2006
[AES06] AES Chemunex (http://www.aes-labguard.com), Last visit: November 29th, 2006
[DeltaTRAK06] DeltaTRAK (http://www.deltatrak.com/thermo_cdx.shtml), Last visit: November 29th, 2006
[Rees06] Rees Scientific (http://www.reesscientific.com/Centron.htm), Last visit: November 29th, 2006
[Triple06] Triple Red – Laboratory Technology (http://www.triplered.com/Products/alarms.htm), Last visit: November 29th, 2006
[UniMunich06] University of Munich (http://leifi.physik.uni-muenchen.de/web_ph09/umwelt_technik/07kuehlschrank/kuehlschrank.htm), Last visit: November 8th, 2006


Other Sources:

[DEMO06] Exported data and screenshots, Contell/IKS demo system, Date of export: June – November 2006 (according to requirements)
[UMC06] Exported operating data, UMC St. Radboud (University Hospital of Nijmegen, the Netherlands), Date of export: June 1st, 2001


Appendix 1 – Implementation of Interpolation

As already explained in section 6.1, the collected datasets have to be interpolated to obtain constant time intervals between single measuring values. This interpolation is done by the following algorithm.

The basic steps of this algorithm are:

1. Import of the monitoring data
2. Conversion of date and time to the right format
3. Interpolation of a measurement value for every single minute (the number of door openings is stored at the beginning of a minute)
4. Storage of the calculated values to disk
5. Reimport of the calculated values from disk for validation purposes

To be able to import the CSV files from XiltriX, they have to be adapted, as already mentioned in section 6.1. The reason is a different usage of delimiters: XiltriX exports the data with a point as thousands separator and a comma as decimal separator, whereas Matlab interprets the point as decimal separator and the comma as column separator. To solve this problem, two simple replacements have to be done with a text editor in the following order (a sketch that automates this with Matlab follows the list):

1. Replace “.” with “”
2. Replace “,” with “.”
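A minimal sketch of automating these two replacements with Matlab itself instead of a text editor; the file names are assumptions, and the global replacement is only safe because the export contains no points other than the thousands separators:

raw = fileread('Channel1_export.csv');   %raw XiltriX export (assumed name)
raw = strrep(raw, '.', '');              %1. drop the thousands separators
raw = strrep(raw, ',', '.');             %2. turn commas into decimal points
fid = fopen('Channel1.csv', 'w');        %converted file used by importdata
fprintf(fid, '%s', raw);
fclose(fid);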

An experienced programmer may notice that the following code is written in an iterative rather than an object-oriented manner. The reason is the limited support Matlab offers: it is possible to encapsulate at least procedures in so-called M-files,85 but these have a significant negative influence on the runtime, which can be traced back to Matlab's internal data exchange behavior. Hence, a simple iterative structure is used.

Another problem of Matlab is the lack of well-scaling data types such as linked lists. Hence, the following algorithm slows down very quickly: the first thousand values, for instance, are calculated in about 45 seconds, which is nearly ten times faster than the second thousand values, and the next thousand values take even more calculation time.86 As the collected datasets contain about 37000 values, the calculation would take hours to days. The solution found is to store the intermediate data to disk every 250 values. This leads to a running time of about 26 minutes for a 37000 value dataset.

85 See e.g. [Benker01], p. 48-55
86 Tests were made with a Pentium 3 Mobile, 1 GHz, 256 MB RAM
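An alternative remedy, not used in the thesis implementation but common in Matlab, is to preallocate the result matrix instead of growing it inside the loop; a minimal sketch with assumed dimensions:

n = 37000;                %expected number of interpolated minutes (assumed)
ID = zeros(n, 5);         %preallocate instead of starting with ID = []
row = 0;
for i = 1:n
    row = row + 1;
    ID(row,:) = [i, 0, 0, 0, 0];   %placeholder for the computed values
end
ID = ID(1:row,:);         %trim unused rows before writing to disk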

The actual implementation is printed on the following pages.


%Name and location of the source file
unit = 1;
filename = strcat('Channel', int2str(unit), '.csv');
path = 'C:\Dokumente und Einstellungen\Christian\Eigene Dateien\Dokumente\Studium\Diplomarbeit\Monitoring Data Nijmegen (Converted)\';
%Import dataset
import = importdata(strcat(path, filename));
%If no door sensor is available, add a 0 column (for compatibility reasons)
if length(import.data(1,:)) == 5;
    import.data(:,6) = 0;
    disp('No Doorsensor installed! => Column added');
end
%Create date vector (as serial date number)
date = datenum(import.textdata(:,1), 'dd-mm-yy HH:MM:SS');

%Algorithm for interpolation
%Definition of a second
second = 1/(60*60*24);
%Definition of a minute (for performance reasons)
minute = 1/(60*24);
%Current position within the import vector
position = 1;
%Length of the data vector (for performance reasons)
datalength = length(import.data(:,2));
%New matrix for the interpolated data
%(Contains: Date/Time, Interpolated Temperature, Dooropenings, Lower Border, Upper Border)
ID = [];
%Next save position (see below)
saveposition = 250;
disp(strcat('Start of Computation:_', datestr(now)));
%Initialise time to the first complete minute of the imported data
starttime = (date(position) - mod(date(position),minute)) + minute;
%NOTE: The head of the following loop was partly lost during the document
%conversion; the loop condition and the search for the next measuring point
%are a reconstruction based on the surrounding code.
while date(position + 1) < date(datalength);
    %Find the next measuring point and sum up the door openings in between
    jumplength = 1;
    while (position + jumplength < datalength) && (date(position + jumplength) < starttime);
        jumplength = jumplength + 1;
    end
    dooropenings = 0;
    for i = position:position + jumplength;
        if ~isnan(import.data(i,6));
            dooropenings = dooropenings + import.data(i,6);
        else
            disp(strcat('NaN found at_: ', datestr(date(position))));
        end
    end
    %Interpolate a value for every minute between the two measuring points
    for i = starttime:minute:date(position+jumplength);
        ID = [ID; [i, round(10 * interp1([date(position), date(position+jumplength)], [import.data(position,2), import.data(position+jumplength,2)], i, 'linear'))/10, dooropenings, import.data(position,4), import.data(position,5)]];
        %Make sure that the number of door openings is only added once
        dooropenings = 0;
        starttime = starttime + minute;
        %Correct calculation mistakes (rounding errors of the serial dates)
        if mod(starttime,minute) >= second;
            starttime = (starttime - mod(starttime,minute)) + minute;
        end;
    end
    position = position + jumplength;
    %Store to disk whenever the next 250 positions are reached (performance reasons)
    if position >= saveposition;
        dlmwrite(strcat(path, filename, '- Interpolated.txt'), ID, 'delimiter', ';', 'newline', 'pc', 'precision', '%.12f', '-append');
        ID = [];
        saveposition = saveposition + 250;
    end
end
%Save the rest
dlmwrite(strcat(path, filename, '- Interpolated.txt'), ID, 'delimiter', ';', 'newline', 'pc', 'precision', '%.12f', '-append');
disp(strcat('End of Computation:_', datestr(now)));

%Import the file back from disk
interpolation = importdata(strcat(path, filename, '- Interpolated.txt'));
%Show a summary of the imported data
%Count door openings in the original file
dooropenings = 0;
for i = 1:length(import.data);
    if ~isnan(import.data(i,6));
        dooropenings = dooropenings + import.data(i,6);
    end
end
disp(strcat('Dooropenings (Original File):_', int2str(dooropenings)));
disp(strcat('Dooropenings (Interpolated File):_', int2str(sum(interpolation(:,3)))));
disp(strcat('Dataset Starting Time:_', datestr(interpolation(1,1))));
disp(strcat('Dataset Ending Time:_', datestr(interpolation(length(interpolation),1))));


Appendix 2 – Implementation of Statistical Methods

This section introduces the implementation of the suggested statistical data analysis. As described in section 6.1, the promising statistical measures are calculated on a daily basis (whole day, daytime and nighttime). All results are exported to Microsoft Excel files to allow additional data analysis. Moreover, the graphs from chapter 6 are plotted and saved to disk.

The basic steps of this implementation are:

1. Import of the monitoring data and the interpolated data
2. Calculation of the daily minimum and maximum (whole day, daytime, nighttime), based on the non-interpolated data
3. Calculation of the daily mean, mode, median and standard deviation (whole day, daytime, nighttime), based on the interpolated data
4. Calculation of the daily door openings (whole day, daytime, nighttime)
5. Calculation of the temperature distribution
6. Creation of the graphs
7. Storage of the calculated values and graphs to disk

The actual implementation is printed on the following pages.


%Name and location of the source file
unit = 1;
filename = strcat('Channel', int2str(unit), '.csv');
path = 'C:\Dokumente und Einstellungen\Christian\Eigene Dateien\Dokumente\Studium\Diplomarbeit\Monitoring Data Nijmegen (Converted)\';
%Import dataset
import = importdata(strcat(path, filename));
%Create date vector (as serial date number)
date = datenum(import.textdata(:,1), 'dd-mm-yy HH:MM:SS');
%Import the interpolated data from disk
interpolation = importdata(strcat(path, filename, '- Interpolated.txt'));
%Definition of a second
second = 1/(24*60*60);
%Definition of a minute (for performance reasons)
minute = 1/(60*24);
%Definition of day- and nighttime
daybegin = (1/24)*6;
dayend = (1/24)*22;
%Start (index of the imported data)
%1 = begin of the imported file, add 1440 per day
start = 1 + 7*1440;

%-----Minima & Maxima-----
%(Use non-interpolated data)
%Auxiliary variable
startposition = 1;
while floor(date(startposition) + second) < floor(interpolation(start,1) + second);
    startposition = startposition + 1;
end
jumplength = 0;
%Create date vector for the first column (per day)
dailydate = floor(date(startposition) + second);
%Create min vector for the second column (per day)
minvector = [];
%Create min daytime vector for the third column (per day)
mindaytime = [];
%Create min nighttime vector for the fourth column (per day)
minnighttime = [];
%The same for the maxima table
maxvector = [];
maxdaytime = [];
maxnighttime = [];
for i = startposition:length(date);
    if isequal(floor(date(startposition) + second), floor(date(i) + second));
        jumplength = jumplength + 1;
    else


        dailydate = [dailydate; floor(date(i) + second)];
        %Vectors for daily minimum & maximum
        minvector = [minvector; min(import.data(startposition:startposition + jumplength - 1,2))];
        maxvector = [maxvector; max(import.data(startposition:startposition + jumplength - 1,2))];
        %Compute day- and nighttime values (daytime: see definition of daybegin & dayend)
        nighttemp = [];
        daytemp = [];
        %NOTE: The body of the following loop, the subsequent day/night
        %assignments and the export of the minima were partly lost during the
        %document conversion; they are reconstructed from the analogous
        %section for mean, median, mode and standard deviation below.
        for j = startposition:startposition + jumplength - 1;
            if (mod(date(j) + second, 1) >= daybegin) && (mod(date(j) + second, 1) < dayend);
                daytemp = [daytemp; import.data(j,2)];
            else
                nighttemp = [nighttemp; import.data(j,2)];
            end
        end
        mindaytime = [mindaytime; min(daytemp)];
        minnighttime = [minnighttime; min(nighttemp)];
        maxdaytime = [maxdaytime; max(daytemp)];
        maxnighttime = [maxnighttime; max(nighttemp)];
        startposition = i;
        jumplength = 1;
    end
end
%Export to Excel
xlswrite(strcat(path, 'Excel\', filename, '- Minima'), [dailydate-693960, minvector, mindaytime, minnighttime], 'Minima', 'A2');
xlswrite(strcat(path, 'Excel\', filename, '- Maxima'), [dailydate-693960, maxvector, maxdaytime, maxnighttime], 'Maxima', 'A2');
%Total values
totalmin = min(import.data(:,2));
totalmax = max(import.data(:,2));

%-----Mean, Median, Mode, Standard Deviation-----
%(Use interpolated data)
%Create date vector
interpolateddailydate = floor(interpolation(start,1) + second);
%Create mean vectors
meanvector = [];
meandaytime = [];
meannighttime = [];
%Create median vectors
medianvector = [];
mediandaytime = [];
mediannighttime = [];
%Create mode vectors
modevector = [];
modedaytime = [];
modenighttime = [];
%Create standard deviation vectors
stdvector = [];
stddaytime = [];
stdnighttime = [];
%Create vectors for the number of door openings
dailydooropenings = [];
daytimedooropenings = [];
nighttimedooropenings = [];
%Auxiliary variables
startposition = start;
jumplength = 0;
for i = startposition:length(interpolation);
    if isequal(floor(interpolation(startposition,1) + second), floor(interpolation(i,1) + second));
        jumplength = jumplength + 1;
    else %This is called when the date changes...
        interpolateddailydate = [interpolateddailydate; floor(interpolation(i,1) + second)];
        %Vectors for daily values (mean, median, mode, standard deviation)
        meanvector = [meanvector; mean(interpolation(startposition:startposition + jumplength - 1,2))];
        medianvector = [medianvector; median(interpolation(startposition:startposition + jumplength - 1,2))];
        modevector = [modevector; mode(interpolation(startposition:startposition + jumplength - 1,2))];
        stdvector = [stdvector; std(interpolation(startposition:startposition + jumplength - 1,2))];


        %Count door openings per day
        dailydooropenings = [dailydooropenings; sum(interpolation(startposition:startposition + jumplength - 1,3))];
        %Compute day- and nighttime values
        %(mean, median, mode, standard deviation)
        nighttemp = [];
        daytemp = [];
        %NOTE: Parts of the following loop, the day/night assignments, the
        %day/night door opening counts, the handling of the last day and the
        %export of the mean were lost during the document conversion; they are
        %reconstructed from the surrounding code.
        daytimeopen = 0;
        nighttimeopen = 0;
        for j = startposition:startposition + jumplength - 1;
            if (mod(interpolation(j,1) + second, 1) >= daybegin) && (mod(interpolation(j,1) + second, 1) < dayend);
                daytemp = [daytemp; interpolation(j,2)];
                daytimeopen = daytimeopen + interpolation(j,3);
            else
                nighttemp = [nighttemp; interpolation(j,2)];
                nighttimeopen = nighttimeopen + interpolation(j,3);
            end
        end
        meandaytime = [meandaytime; mean(daytemp)];
        meannighttime = [meannighttime; mean(nighttemp)];
        mediandaytime = [mediandaytime; median(daytemp)];
        mediannighttime = [mediannighttime; median(nighttemp)];
        modedaytime = [modedaytime; mode(daytemp)];
        modenighttime = [modenighttime; mode(nighttemp)];
        stddaytime = [stddaytime; std(daytemp)];
        stdnighttime = [stdnighttime; std(nighttemp)];
        daytimedooropenings = [daytimedooropenings; daytimeopen];
        nighttimedooropenings = [nighttimedooropenings; nighttimeopen];
        startposition = i;
        jumplength = 1;
    end
end
%Values for the last day
meanvector = [meanvector; mean(interpolation(startposition:startposition + jumplength - 1,2))];
medianvector = [medianvector; median(interpolation(startposition:startposition + jumplength - 1,2))];
modevector = [modevector; mode(interpolation(startposition:startposition + jumplength - 1,2))];
stdvector = [stdvector; std(interpolation(startposition:startposition + jumplength - 1,2))];
%Day- and nighttime...
nighttemp = [];
daytemp = [];
for j = startposition:startposition + jumplength - 1;
    if (mod(interpolation(j,1) + second, 1) >= daybegin) && (mod(interpolation(j,1) + second, 1) < dayend);
        daytemp = [daytemp; interpolation(j,2)];
    else
        nighttemp = [nighttemp; interpolation(j,2)];
    end
end
%(Reconstructed analogously to the daily loop above)
meandaytime = [meandaytime; mean(daytemp)];
meannighttime = [meannighttime; mean(nighttemp)];
mediandaytime = [mediandaytime; median(daytemp)];
mediannighttime = [mediannighttime; median(nighttemp)];
modedaytime = [modedaytime; mode(daytemp)];
modenighttime = [modenighttime; mode(nighttemp)];
stddaytime = [stddaytime; std(daytemp)];
stdnighttime = [stdnighttime; std(nighttemp)];
%Export to Excel
xlswrite(strcat(path, 'Excel\', filename, '- Mean'), [interpolateddailydate-693960, round(meanvector*10)/10, round(meandaytime*10)/10, round(meannighttime*10)/10], 'Mean', 'A2');
xlswrite(strcat(path, 'Excel\', filename, '- Median'), [interpolateddailydate-693960, round(medianvector*10)/10, round(mediandaytime*10)/10, round(mediannighttime*10)/10], 'Median', 'A2');
xlswrite(strcat(path, 'Excel\', filename, '- Mode'), [interpolateddailydate-693960, round(modevector*10)/10, round(modedaytime*10)/10, round(modenighttime*10)/10], 'Mode', 'A2');
xlswrite(strcat(path, 'Excel\', filename, '- Standard Deviation'), [interpolateddailydate-693960, round(stdvector*10)/10, round(stddaytime*10)/10, round(stdnighttime*10)/10], 'Standard Deviation', 'A2');
xlswrite(strcat(path, 'Excel\', filename, '- Dooropenings'), [interpolateddailydate-693960, dailydooropenings, daytimedooropenings, nighttimedooropenings], 'Dooropenings', 'A2');
%Total values
totalmean = mean(interpolation(:,2));
totalmedian = median(interpolation(:,2));
totalmode = mode(interpolation(:,2));
totalstd = std(interpolation(:,2));
totaldooropenings = sum(interpolation(:,3));

%-----Temperature Distribution-----
%Count the total occurrences of the single values
%("round" is necessary here, otherwise some comparisons fail)
%Contains: [Temperature, Minutes of Occurrence]
totalOC = [];
for i = min(interpolation(:,2)):0.1:max(interpolation(:,2));
    totalOC = [totalOC; [i, sum(interpolation(:,2) == round(i*10)/10)]];
end

%-----Plot Statements-----
%Temperature overview
plot(date, import.data(:,2), 'k');
hold on;
plot(interpolation(:,1), interpolation(:,4), '--b');
plot(interpolation(:,1), interpolation(:,5), '--r');
datetick('x', 20, 'keeplimits');
title 'Temperature Overview';
xlabel 'Date';
ylabel 'Temperature (°C)';
axis([min(date) max(date) min(import.data(:,2)) max(import.data(:,2))]);
hold off;
print('-dtiff', strcat(path, 'Graphs\', filename, '- Temperature Overview.tif'));

%Maximum values per day
bar([dailydate(1):dailydate(length(maxvector))], maxvector, 'k');
hold on;
plot(interpolation(:,1), interpolation(:,4), '--b');
plot(interpolation(:,1), interpolation(:,5), '--r');
datetick('x', 20, 'keeplimits');
title 'Maximum Values per Day';
xlabel 'Date';
ylabel 'Temperature (°C)';
axis([dailydate(1) dailydate(length(maxvector)) min(maxvector) max(maxvector)]);
hold off;
print('-dtiff', strcat(path, 'Graphs\', filename, '- Maximum Values per Day.tif'));

%Maximum values at daytime
bar([dailydate(1):dailydate(length(maxdaytime))], maxdaytime, 'k');
hold on;
plot(interpolation(:,1), interpolation(:,4), '--b');
plot(interpolation(:,1), interpolation(:,5), '--r');
datetick('x', 20, 'keeplimits');
title 'Maximum Values at Daytime';
xlabel 'Date';
ylabel 'Temperature (°C)';
axis([dailydate(1) dailydate(length(maxvector)) min(maxvector) max(maxvector)]);
hold off;
print('-dtiff', strcat(path, 'Graphs\', filename, '- Maximum Values at Daytime.tif'));

%Maximum values at nighttime
bar([dailydate(1):dailydate(length(maxnighttime))], maxnighttime, 'k');
hold on;
plot(interpolation(:,1), interpolation(:,4), '--b');
plot(interpolation(:,1), interpolation(:,5), '--r');
datetick('x', 20, 'keeplimits');
title 'Maximum Values at Nighttime';
xlabel 'Date';
ylabel 'Temperature (°C)';
axis([dailydate(1) dailydate(length(maxvector)) min(maxvector) max(maxvector)]);
hold off;
print('-dtiff', strcat(path, 'Graphs\', filename, '- Maximum Values at Nighttime.tif'));

%Minimum values per day
bar([dailydate(1):dailydate(length(minvector))], minvector, 'k');
hold on;
plot(interpolation(:,1), interpolation(:,4), '--b');
plot(interpolation(:,1), interpolation(:,5), '--r');
datetick('x', 20, 'keeplimits');
title 'Minimum Values per Day';
xlabel 'Date';
ylabel 'Temperature (°C)';
axis([dailydate(1) dailydate(length(minvector)) min(minvector) max(minvector)]);
hold off;
print('-dtiff', strcat(path, 'Graphs\', filename, '- Minimum Values per Day.tif'));

%Minimum values at daytime
bar([dailydate(1):dailydate(length(mindaytime))], mindaytime, 'k');
hold on;
plot(interpolation(:,1), interpolation(:,4), '--b');
plot(interpolation(:,1), interpolation(:,5), '--r');
datetick('x', 20, 'keeplimits');
title 'Minimum Values at Daytime';
xlabel 'Date';
ylabel 'Temperature (°C)';
axis([dailydate(1) dailydate(length(minvector)) min(minvector) max(minvector)]);
hold off;
print('-dtiff', strcat(path, 'Graphs\', filename, '- Minimum Values at Daytime.tif'));

%Minimum values at nighttime
bar([dailydate(1):dailydate(length(minnighttime))], minnighttime, 'k');
hold on;
plot(interpolation(:,1), interpolation(:,4), '--b');
plot(interpolation(:,1), interpolation(:,5), '--r');
datetick('x', 20, 'keeplimits');
title 'Minimum Values at Nighttime';
xlabel 'Date';
ylabel 'Temperature (°C)';
axis([dailydate(1) dailydate(length(minvector)) min(minvector) max(minvector)]);
hold off;
print('-dtiff', strcat(path, 'Graphs\', filename, '- Minimum Values at Nighttime.tif'));

%Mean values per day
bar([dailydate(1):dailydate(length(meanvector))], meanvector, 'k');
datetick('x', 20, 'keeplimits');
title 'Mean Values per Day';
xlabel 'Date';
ylabel 'Temperature (°C)';
axis([dailydate(1) dailydate(length(meanvector)) min(meanvector) max(meanvector)]);
print('-dtiff', strcat(path, 'Graphs\', filename, '- Mean Values per Day.tif'));

%Mean values at daytime
bar([dailydate(1):dailydate(length(meandaytime))], meandaytime, 'k');
datetick('x', 20, 'keeplimits');
title 'Mean Values at Daytime';
xlabel 'Date';
ylabel 'Temperature (°C)';
axis([dailydate(1) dailydate(length(meanvector)) min(meanvector) max(meanvector)]);
print('-dtiff', strcat(path, 'Graphs\', filename, '- Mean Values at Daytime.tif'));

%Mean values at nighttime
bar([dailydate(1):dailydate(length(meannighttime))], meannighttime, 'k');
datetick('x', 20, 'keeplimits');
title 'Mean Values at Nighttime';
xlabel 'Date';
ylabel 'Temperature (°C)';
axis([dailydate(1) dailydate(length(meanvector)) min(meanvector) max(meanvector)]);
print('-dtiff', strcat(path, 'Graphs\', filename, '- Mean Values at Nighttime.tif'));

%Median values per day
bar([dailydate(1):dailydate(length(medianvector))], medianvector, 'k');
datetick('x', 20, 'keeplimits');
title 'Median Values per Day';
xlabel 'Date';
ylabel 'Temperature (°C)';
axis([dailydate(1) dailydate(length(medianvector)) min(medianvector) max(medianvector)]);
print('-dtiff', strcat(path, 'Graphs\', filename, '- Median Values per Day.tif'));

%Median values at daytime
bar([dailydate(1):dailydate(length(mediandaytime))], mediandaytime, 'k');
datetick('x', 20, 'keeplimits');
title 'Median Values at Daytime';
xlabel 'Date';
ylabel 'Temperature (°C)';
axis([dailydate(1) dailydate(length(medianvector)) min(medianvector) max(medianvector)]);
print('-dtiff', strcat(path, 'Graphs\', filename, '- Median Values at Daytime.tif'));

%Median values at nighttime
bar([dailydate(1):dailydate(length(mediannighttime))], mediannighttime, 'k');
datetick('x', 20, 'keeplimits');
title 'Median Values at Nighttime';
xlabel 'Date';
ylabel 'Temperature (°C)';
axis([dailydate(1) dailydate(length(medianvector)) min(medianvector) max(medianvector)]);
print('-dtiff', strcat(path, 'Graphs\', filename, '- Median Values at Nighttime.tif'));

%Mode values per day
bar([dailydate(1):dailydate(length(modevector))], modevector, 'k');
datetick('x', 20, 'keeplimits');
title 'Mode Values per Day';
xlabel 'Date';
ylabel 'Temperature (°C)';
axis([dailydate(1) dailydate(length(modevector)) min(modevector) max(modevector)]);
print('-dtiff', strcat(path, 'Graphs\', filename, '- Mode Values per Day.tif'));

%Mode values at daytime
bar([dailydate(1):dailydate(length(modedaytime))], modedaytime, 'k');
datetick('x', 20, 'keeplimits');
title 'Mode Values at Daytime';
xlabel 'Date';
ylabel 'Temperature (°C)';
axis([dailydate(1) dailydate(length(modevector)) min(modevector) max(modevector)]);
print('-dtiff', strcat(path, 'Graphs\', filename, '- Mode Values at Daytime.tif'));

%Mode values at nighttime
bar([dailydate(1):dailydate(length(modenighttime))], modenighttime, 'k');
datetick('x', 20, 'keeplimits');
title 'Mode Values at Nighttime';
xlabel 'Date';
ylabel 'Temperature (°C)';
axis([dailydate(1) dailydate(length(modevector)) min(modevector) max(modevector)]);
print('-dtiff', strcat(path, 'Graphs\', filename, '- Mode Values at Nighttime.tif'));

%Standard deviation per day
bar([dailydate(1):dailydate(length(stdvector))], stdvector, 'k');
datetick('x', 20, 'keeplimits');
title({'Standard Deviation per Day'; strcat('(', num2str(round(totalmean*10)/10), '°C Mean Value)')});
xlabel 'Date';
ylabel 'Temperature (°C)';
axis([dailydate(1) dailydate(length(stdvector)) min(stdvector) max(stdvector)]);
print('-dtiff', strcat(path, 'Graphs\', filename, '- Standard Deviation per Day.tif'));

%Standard deviation at daytime
bar([dailydate(1):dailydate(length(stddaytime))], stddaytime, 'k');
datetick('x', 20, 'keeplimits');
title({'Standard Deviation at Daytime'; strcat('(', num2str(round(totalmean*10)/10), '°C Mean Value)')});
xlabel 'Date';
ylabel 'Temperature (°C)';
axis([dailydate(1) dailydate(length(stdvector)) min(stdvector) max(stdvector)]);
print('-dtiff', strcat(path, 'Graphs\', filename, '- Standard Deviation at Daytime.tif'));

%Standard deviation at nighttime
bar([dailydate(1):dailydate(length(stdnighttime))], stdnighttime, 'k');
datetick('x', 20, 'keeplimits');
title({'Standard Deviation at Nighttime'; strcat('(', num2str(round(totalmean*10)/10), '°C Mean Value)')});
xlabel 'Date';
ylabel 'Temperature (°C)';
axis([dailydate(1) dailydate(length(stdvector)) min(stdvector) max(stdvector)]);
print('-dtiff', strcat(path, 'Graphs\', filename, '- Standard Deviation at Nighttime.tif'));

%Temperature distribution
bar(min(totalOC(:,1)):0.1:max(totalOC(:,1)), totalOC(:,2), 'k');
title 'Total Occurrence of Temperature Values';
xlabel 'Temperature (°C)';
ylabel 'Time (Minutes)';
axis([totalOC(1,1) totalOC(length(totalOC),1) min(totalOC(:,2)) max(totalOC(:,2))]);
print('-dtiff', strcat(path, 'Graphs\', filename, '- Total Occurrence of Temperature Values.tif'));

%If a door sensor is installed...
if max(dailydooropenings) > 0;
    %Door openings per day
    bar([dailydate(1):dailydate(length(dailydooropenings))], dailydooropenings, 'k');
    datetick('x', 20, 'keeplimits');
    title 'Dooropenings per Day';
    xlabel 'Date';
    ylabel 'Number of Dooropenings';
    axis([dailydate(1) dailydate(length(dailydooropenings)) min(dailydooropenings) max(dailydooropenings)]);
    print('-dtiff', strcat(path, 'Graphs\', filename, '- Dooropenings per Day.tif'));
    %Door openings at daytime
    bar([dailydate(1):dailydate(length(daytimedooropenings))], daytimedooropenings, 'k');
    datetick('x', 20, 'keeplimits');
    title 'Dooropenings at Daytime';
    xlabel 'Date';
    ylabel 'Number of Dooropenings';
    axis([dailydate(1) dailydate(length(dailydooropenings)) min(dailydooropenings) max(dailydooropenings)]);
    print('-dtiff', strcat(path, 'Graphs\', filename, '- Dooropenings at Daytime.tif'));
    %Door openings at nighttime
    bar([dailydate(1):dailydate(length(nighttimedooropenings))], nighttimedooropenings, 'k');
    datetick('x', 20, 'keeplimits');
    title 'Dooropenings at Nighttime';
    xlabel 'Date';
    ylabel 'Number of Dooropenings';
    axis([dailydate(1) dailydate(length(dailydooropenings)) min(dailydooropenings) max(dailydooropenings)]);
    print('-dtiff', strcat(path, 'Graphs\', filename, '- Dooropenings at Nighttime.tif'));
end


Appendix 3 – Implementation of Data Mining Methods

This section introduces the implementation of the suggested data mining methods, as described in section 6.1.

The basic steps of this implementation are:

1. Import of the monitoring data and the interpolated data
2. Determination of alarms (original limits, self-defined limits)
3. Determination of alarms (with door opening recognition)
4. Determination of the alarm durations and the corresponding classification limits
5. Determination of the maximum alarm temperatures and the corresponding classification limits
6. Determination of the duration of door openings and the corresponding classification limits

The actual implementation is printed on the following pages.


%Name and location of the source file
unit = 1;
filename = strcat('Channel', int2str(unit), '.csv');
path = 'C:\Dokumente und Einstellungen\Christian\Eigene Dateien\Dokumente\Studium\Diplomarbeit\Monitoring Data Nijmegen (Converted)\';
%Import dataset
import = importdata(strcat(path, filename));
%Create date vector (as serial date number)
date = datenum(import.textdata(:,1), 'dd-mm-yy HH:MM:SS');
%Import the interpolated data from disk
interpolation = importdata(strcat(path, filename, '- Interpolated.txt'));
%Definition of a second
second = 1/(24*60*60);
%Definition of a minute (for performance reasons)
minute = 1/(60*24);
%Definition of day- and nighttime
daybegin = (1/24)*6;
dayend = (1/24)*22;
%Start (index of the imported data)
%1 = begin of the imported file, add 1440 per day
start = 1 + 7*1440;
%Self-defined limits:
upperLimit = 6;
lowerLimit = 2;
DoorOffset = 3; %Offset in minutes

%Determine the number of occurred alarms
%-----using the original limits!-----
%Contains the date, the kind of alarm
%(1 = above the high temperature border; -1 = below the low temperature border),
%the duration and the maximum temperature:
%[Date, Type of Alarm, Duration, Maximum Temperature]
AlarmsOL = [];
Alarmbefore = 0;
Duration = 0;
maxtemp = 0;
for i = start:length(interpolation(:,1));
    if (interpolation(i,2) >= interpolation(i,5)) && (Alarmbefore == 0);
        %Get the duration
        k = i;
        %NOTE: This loop condition was partly lost during the document
        %conversion and is reconstructed here.
        while (k < length(interpolation(:,1))) && (interpolation(k,2) >= interpolation(k,5));
            k = k + 1;
        end
        Duration = k - i;
        maxtemp = max(interpolation(i:k,2));
        AlarmsOL = [AlarmsOL; [interpolation(i,1), 1, Duration, maxtemp]];
        Alarmbefore = 1;


        Duration = 0;
        maxtemp = 0;
    %NOTE: The low-temperature branch was lost during the document
    %conversion; this is a reconstruction that mirrors the
    %high-temperature branch above.
    elseif (interpolation(i,2) <= interpolation(i,4)) && (Alarmbefore == 0);
        k = i;
        while (k < length(interpolation(:,1))) && (interpolation(k,2) <= interpolation(k,4));
            k = k + 1;
        end
        Duration = k - i;
        maxtemp = min(interpolation(i:k,2));
        AlarmsOL = [AlarmsOL; [interpolation(i,1), -1, Duration, maxtemp]];
        Alarmbefore = 1;
        Duration = 0;
        maxtemp = 0;
    else
        if (interpolation(i,2) < interpolation(i,5)) && (interpolation(i,2) > interpolation(i,4));
            Alarmbefore = 0;
        end
    end
end

%NOTE: The analogous computation of "AlarmsDL" (alarms with the self-defined
%limits, without door opening recognition) was lost during the document
%conversion; it mirrors the block above with "upperLimit" and "lowerLimit"
%instead of the stored borders and is used further below.

%-----The same calculation with the self-defined limits-----
%-----(Ignore alarms after door openings within the offset time)-----
%Contains: [Date, Type of Alarm, Duration, Maximum Temperature]
AlarmsDLNoDoor = [];
Alarmbefore = 0;
Duration = 0;
maxtemp = 0;
for i = (start + DoorOffset):length(interpolation(:,1))-1;
    %High-temperature alarm
    if (interpolation(i,2) >= upperLimit) & (Alarmbefore == 0);
        %Only if there was no door opening...
        if sum(interpolation(i-DoorOffset:i+1,3)) == 0;
            %Get the duration
            k = i;
            %NOTE: loop condition reconstructed (see above)
            while (k < length(interpolation(:,1))) && (interpolation(k,2) >= upperLimit);
                k = k + 1;
            end
            Duration = k - i;
            maxtemp = max(interpolation(i:k,2));
            AlarmsDLNoDoor = [AlarmsDLNoDoor; [interpolation(i,1), 1, Duration, maxtemp]];
        end
        Alarmbefore = 1;
        Duration = 0;
        maxtemp = 0;
    %Low-temperature alarm
    %NOTE: this branch was lost during the document conversion and is
    %reconstructed as a mirror of the high-temperature branch.
    elseif (interpolation(i,2) <= lowerLimit) & (Alarmbefore == 0);
        if sum(interpolation(i-DoorOffset:i+1,3)) == 0;
            k = i;
            while (k < length(interpolation(:,1))) && (interpolation(k,2) <= lowerLimit);
                k = k + 1;
            end
            Duration = k - i;
            maxtemp = min(interpolation(i:k,2));
            AlarmsDLNoDoor = [AlarmsDLNoDoor; [interpolation(i,1), -1, Duration, maxtemp]];
        end
        Alarmbefore = 1;
        Duration = 0;
        maxtemp = 0;
    else
        if (interpolation(i,2) < upperLimit) && (interpolation(i,2) > lowerLimit);
            Alarmbefore = 0;
        end
    end
end

%-----Check probability of alarm durations-----
%Contains: [Duration, Number of Occurrences, Percentage, Accumulated Percentage]
DurationDL = [];
OccurenceTemp = 0; %For performance reasons
for i = min(AlarmsDL(:,3)):max(AlarmsDL(:,3));
    OccurenceTemp = histc(AlarmsDL(:,3), round(i*10)/10);
    if isempty(DurationDL);
        DurationDL = [DurationDL; [i, OccurenceTemp, (OccurenceTemp/length(AlarmsDL(:,1)))*100, (OccurenceTemp/length(AlarmsDL(:,1)))*100]];
        OccurenceTemp = 0;
    else
        if OccurenceTemp > 0;
            DurationDL = [DurationDL; [i, OccurenceTemp, (OccurenceTemp/length(AlarmsDL(:,1)))*100, sum(DurationDL(1:length(DurationDL(:,1)),3)) + (OccurenceTemp/length(AlarmsDL(:,1)))*100]];
        end
        OccurenceTemp = 0;
    end
end

%-----Check probability of the current temperature (within alarming situations)-----
%(Calculated by using the maximum values per alarm)
%Contains: [Maximum Temperature, Number of Occurrences, Percentage, Accumulated Percentage]
ProbabilityDL = [];
OccurenceTemp = 0; %For performance reasons
for i = min(AlarmsDL(:,4)):0.1:max(AlarmsDL(:,4));
    OccurenceTemp = histc(AlarmsDL(:,4), round(i*10)/10);
    if isempty(ProbabilityDL);
        ProbabilityDL = [ProbabilityDL; [i, OccurenceTemp, (OccurenceTemp/length(AlarmsDL(:,1)))*100, (OccurenceTemp/length(AlarmsDL(:,1)))*100]];
        OccurenceTemp = 0;
    else
        if OccurenceTemp > 0;
            ProbabilityDL = [ProbabilityDL; [i, OccurenceTemp, (OccurenceTemp/length(AlarmsDL(:,1)))*100, sum(ProbabilityDL(1:length(ProbabilityDL(:,1)),3)) + (OccurenceTemp/length(AlarmsDL(:,1)))*100]];
        end
        OccurenceTemp = 0;
    end
end

%-----Durations of door openings-----
%Contains: [Date, Duration (in seconds)]
Dooropeningtime = [];
%Get the starting position for the non-interpolated data
startposition = 1;
while floor(date(startposition) + second) < floor(interpolation(start,1) + second);
    startposition = startposition + 1;
end
%Get the duration of door openings
for i = startposition:length(import.data(:,2))-1;
    if (import.data(i,6) == 1) & (import.data(i+1,6) == 0);
        Dooropeningtime = [Dooropeningtime; [date(i), round((date(i+1)-date(i))*60*60*24)]];
    else
        if (import.data(i,6) == 1) & (import.data(i+1,6) == 1);
            jumplength = 0;
            %NOTE: The remainder of this loop and the head of the following
            %distribution were lost during the document conversion; they are
            %reconstructed analogously to the sections above.
            while (import.data(i+jumplength,6) == 1) && (i + jumplength < length(import.data(:,2)));
                jumplength = jumplength + 1;
            end
            Dooropeningtime = [Dooropeningtime; [date(i), round((date(i+jumplength)-date(i))*60*60*24)]];
        end
    end
end

%-----Probability of door opening durations-----
%Contains: [Duration, Number of Occurrences, Percentage, Accumulated Percentage]
DoorProbability = [];
OccurenceTemp = 0; %For performance reasons
for i = min(Dooropeningtime(:,2)):max(Dooropeningtime(:,2));
    OccurenceTemp = histc(Dooropeningtime(:,2), i);
    if isempty(DoorProbability);
        DoorProbability = [DoorProbability; [i, OccurenceTemp, (OccurenceTemp/length(Dooropeningtime(:,1)))*100, (OccurenceTemp/length(Dooropeningtime(:,1)))*100]];
        OccurenceTemp = 0;
    else
        if OccurenceTemp > 0;
            DoorProbability = [DoorProbability; [i, OccurenceTemp, (OccurenceTemp/length(Dooropeningtime(:,1)))*100, sum(DoorProbability(1:length(DoorProbability(:,1)),3)) + (OccurenceTemp/length(Dooropeningtime(:,1)))*100]];
        end
        OccurenceTemp = 0;
    end
end

%-----Display information for a certain percentage-----
Percentagelimit = 50;
%NOTE: The disp statements of this section, which reported the door opening
%duration for the chosen percentage limit, were truncated during the
%document conversion and could not be recovered.


Erklärung (Statement)

Diploma thesis by: Christian Kaak
Matriculation no.: 2690287
Topic: Ausfallprognosen mit Hilfe erweiterter Monitoring Systeme (Breakdown Prediction by the Use of Extended Monitoring Systems)

With my signature I affirm that I have written this thesis independently and without the use of aids other than those stated. All passages taken literally or analogously from published or unpublished writings are marked as such.

Neither this thesis nor excerpts from it have been submitted in the same or a similar form to this or any other examination authority.

I am aware that submitting a false declaration means that the diploma examination is to be considered failed.

Braunschweig, 05.02.2007

Signature
