Casestudie Breakdown prediction Contell PILOT - Transumo
Technische Universität Braunschweig<br />
Diplomarbeit<br />
AUSFALLPROGNOSEN MIT HILFE<br />
ERWEITERTER MONITORING SYSTEME<br />
(<strong>Breakdown</strong> Prediction by the Use of Extended Monitoring Systems)<br />
von<br />
Christian Kaak<br />
Februar 2007<br />
Institut für Wirtschaftswissenschaften,<br />
Lehrstuhl für Betriebswirtschaftslehre,<br />
insbesondere Produktion und Logistik<br />
Technische Universität Braunschweig<br />
Prüfer:<br />
Prof. Dr. T. Spengler<br />
Betreuer:<br />
Dr. Grit Walther
Table of Contents<br />
Index of Figures...............................................................................................................................IV<br />
Index of Tables.................................................................................................................................V<br />
Index of Formulas............................................................................................................................VI<br />
1 Introduction................................................................................................................................ 1<br />
1.1 Initial Position and Problem ............................................................................................. 1<br />
1.2 Goals of this Study and Approach................................................................................... 1<br />
2 Sensor Based Temperature Monitoring.................................................................................... 3<br />
2.1 Importance of Temperature Monitoring within Medical Laboratories.............................. 3<br />
2.2 Functioning and Behavior of Freezers and Fridges ........................................................ 4<br />
2.2.1 General Functioning of a Fridge.................................................................................. 4<br />
2.2.2 Technical Behavior of Fridges (Without External Influences)..................................... 5<br />
2.2.3 Technical Behavior of Freezers (Without External Influences)................................... 6<br />
2.2.4 Behavior in Practice..................................................................................................... 7<br />
2.2.5 Behavior in Case of a Malfunction............................................................................... 9<br />
2.3 Current Practice of Sensor Based Temperature Monitoring......................................... 10<br />
2.4 Problems and Potential Sources of Error...................................................................... 11<br />
2.4.1 The Lack of Information Problem .............................................................................. 12<br />
2.4.2 Potential Sources of Error ......................................................................................... 14<br />
2.4.3 Methodological Problems .......................................................................................... 15<br />
2.5 Aimed Goal and Requirements Analysis....................................................................... 16<br />
3 Current Monitoring Systems ................................................................................................... 19<br />
3.1 XiltriX’s Technical Basis................................................................................................. 19<br />
3.1.1 Basic Components of a XiltriX Installation ................................................................ 21<br />
3.1.2 Other Installation Possibilities ................................................................................... 22<br />
3.2 XiltriX’s Basic Functionality............................................................................................ 23<br />
3.2.1 Current Possibilities to Display and Analyze Stored Data ........................................ 26<br />
3.2.2 Documentation of Occurred Alarms .......................................................................... 30<br />
3.3 XiltriX’s Additional Features........................................................................................... 31<br />
3.3.1 Different Types of Attachable Digital Switches ......................................................... 31<br />
3.3.2 Time-Dependent Limit Settings................................................................................. 32<br />
3.3.3 Alarm-, SMS- and E-Mail-Programs.......................................................................... 33<br />
3.4 Review of XiltriX According to the Requirements Analysis........................................... 35<br />
3.5 Other Major Monitoring Products in the Market ............................................................ 36<br />
3.5.1 3M FreezeWatch and 3M MonitorMark Indicators............................................. 37<br />
3.5.2 2DI ThermaViewer..................................................................................................... 38<br />
3.5.3 Systems Offering Data Analysis in Retrospect ......................................................... 39<br />
4 Current State of Research ...................................................................................................... 43<br />
4.1 Current State within the Setting of Sensor Based Temperature Monitoring................. 43<br />
4.2 Current State within the Setting of Machinery Condition Monitoring ............................ 43<br />
4.3 Current State within the Setting of Measurement Data Analysis .................................. 46<br />
4.3.1 Basic Approaches...................................................................................................... 46<br />
4.3.2 A Generalized Approach ........................................................................................... 47<br />
4.4 Review of Current State of Research............................................................................ 53<br />
5 Possible and Promising Ways of Data Analysis..................................................................... 55<br />
5.1 The Six Possible Levels of Data Analysis ..................................................................... 55<br />
5.2 Different Kinds of Statistical Analysis............................................................................ 57<br />
5.3 Basic Descriptive Statistical Measures.......................................................................... 58<br />
5.4 Regression..................................................................................................................... 60<br />
5.4.1 The Determination of Regression Functions............................................................. 61<br />
5.4.2 The Major Problems of Regression........................................................................... 63<br />
5.5 Time Series Analysis ..................................................................................................... 65<br />
5.6 Failure- and Availability Ratios ...................................................................................... 67<br />
5.7 Markov Chains............................................................................................................... 68<br />
5.8 Inferential Statistics........................................................................................................ 72<br />
5.9 Data Mining.................................................................................................................... 73<br />
5.9.1 General Fields of Application .................................................................................... 73<br />
5.9.2 Artificial Neural Networks .......................................................................................... 75<br />
5.9.3 Non-Applicability of Artificial Neural Networks to Current Datasets ......................... 78<br />
5.10 Promising Analyzing Methods ....................................................................................... 79<br />
5.10.1 Promising Appliance of Basic Descriptive Statistics............................................. 79<br />
5.10.2 Detection of Changes in Behavior by the Use of Regression............................... 81<br />
5.10.3 Classification by Using Past Behavior .................................................................. 82<br />
5.10.4 Review................................................................................................................... 83<br />
6 Implementation and Case Study............................................................................................. 86<br />
6.1 Implementation of Promising Analyzing Methods ......................................................... 86<br />
6.2 Case Study .................................................................................................................... 89<br />
6.2.1 Detection of Changes in Behavior by Using Descriptive Statistics........................... 90<br />
6.2.2 Detection of Changes in Behavior by the Use of Regression................................... 98<br />
6.2.3 Classification of Alarms by the Use of Historical Data.............................................. 99<br />
6.3 Review ......................................................................................................................... 101<br />
6.4 Recommendations....................................................................................................... 102<br />
7 Summary............................................................................................................................... 105<br />
Bibliography.................................................................................................................................. 107<br />
Appendix 1 – Implementation of Interpolation.............................................................................. 111<br />
Appendix 2 – Implementation of Statistical Methods................................................................... 115<br />
Appendix 3 – Implementation of Data Mining Methods ............................................................... 127<br />
Erklärung (Statement) .................................................................................................................. 134<br />
Index of Figures<br />
Figure 2-2: Temperature Sequence of a Properly Working 6°C Passive Fridge [DEMO06]........... 5<br />
Figure 2-3: Temperature Sequence of a Properly Working 6°C Active Fridge [DEMO06] ............. 6<br />
Figure 2-4: Temperature Sequence of a -80°C Active Freezer [DEMO06]..................................... 7<br />
Figure 2-5: Temperature Sequence of a -80°C Passive Freezer [DEMO06] .................................. 8<br />
Figure 2-6: Temperature Sequence of a -20°C Active Freezer [DEMO06]..................................... 8<br />
Figure 2-7: Temperature Sequence of a Cryogenic Freezer in Practical Use [UMC06] ................. 9<br />
Figure 2-8: Lack of Information Problem Caused by Sensor Based Temperature Monitoring ..... 13<br />
Figure 2-9: The Problem of Unknown Behavior between Two Single Data Points....................... 13<br />
Figure 2-10: Estimated Answers of Statistics and Data Mining..................................................... 17<br />
Figure 3-1: Flowchart of the Temperature Monitoring Task .......................................................... 20<br />
Figure 3-2: XiltriX - Schematic Drawing of an Installation with Basic Components ...................... 22<br />
Figure 3-3: XiltriX - The Main Screen [DEMO06]........................................................................... 24<br />
Figure 3-4: XiltriX - Stored Data in Table Form [DEMO06] ........................................................... 27<br />
Figure 3-5: XiltriX - Stored Data in Graphical Form [DEMO06]..................................................... 28<br />
Figure 3-6: XiltriX – Available Statistical Information [DEMO06]................................................... 29<br />
Figure 3-8: XiltriX - Time Dependent Limit Settings [DEMO06]..................................................... 33<br />
Figure 3-9: XiltriX - Setting up an Alarm Relay [DEMO06] ............................................................ 34<br />
Figure 3-13: Centron - A Sample Graph with Multiple Scales [Rees06] ....................................... 42<br />
Figure 4-1: General Overview of the Generalized Approach ([Daßler95], p. 22) (adapted) ......... 48<br />
Figure 4-2: A Delayed Trend Recognition Due to Removal of "Outliers" ...................................... 49<br />
Figure 5-1: Two Samples of Regression ([Bourier03], p. 167) (adapted)...................................... 61<br />
Figure 5-2: Incorrect Regression Function due to an Outlier ([Eckey02], p. 180) (adapted) ........ 63<br />
Figure 5-3: Correct Regression Function ([Eckey02], p.180) (adapted)........................................ 63<br />
Figure 5-4: Sales of an Industrial Heater [Chatfield04].................................................................. 66<br />
Figure 5-5: Sample Transition Probability Graph........................................................................... 70<br />
Figure 5-6: Functioning of an Artificial Neuron ([Hagen97], p. 8) (adapted).................................. 76<br />
Figure 6-1: Exported XiltriX Data (An Excerpt) .............................................................................. 86<br />
Figure 6-2: Temperature Overview of the Selected Sample Dataset............................................ 89<br />
Figure 6-3: Maximum Values at Daytime....................................................................................... 91<br />
Figure 6-4: Maximum Values at Nighttime..................................................................................... 91<br />
Figure 6-5: Minimum Values at Daytime........................................................................................ 93<br />
Figure 6-6: Minimum Values at Nighttime...................................................................................... 93<br />
Figure 6-7: Mean Values at Daytime.............................................................................................. 94<br />
Figure 6-8: Mean Values at Nighttime ........................................................................................... 94<br />
Figure 6-9: Standard Deviation at Daytime.................................................................................... 95<br />
Figure 6-10: Standard Deviation at Nighttime................................................................................ 95<br />
Figure 6-11: Daily Door Openings and Temperature Distribution of the Selected Dataset .......... 98<br />
Figure 6-12: Regression Function for the Selected Dataset.......................................................... 99<br />
Index of Tables<br />
Table 2-1: Error of First and Second Kind ..................................................................................... 12<br />
Table 3-1: Listing of Existing Table Color Codes and Their Meaning ........................................... 25<br />
Table 3-2: Listing of Existing Status bar Color Codes and Their Meaning.................................... 26<br />
Table 3-3: Compliance of XiltriX According to the Requirements Analysis................................... 36<br />
Table 3-4: Compliance of 3M Indicators According to the Requirements Analysis................... 38<br />
Table 3-5: Compliance of the 2DI ThermaViewer According to the Requirements Analysis........ 39<br />
Table 4-1: Compliance of the Generalized Approach According to the Requirements Analysis .. 54<br />
Table 5-1: Estimated Improvements .............................................................................................. 85<br />
Table 6-1: Import Problems of Tested Software Products............................................................. 86<br />
Table 6-2: The Chosen Deltas ....................................................................................................... 96<br />
Table 6-3: Reported Notifications (Based on Nighttime Data)....................................................... 97<br />
Table 6-4: Classification of Alarms............................................................................................... 100<br />
Table 6-5: Results of Classification According to Single Criteria.................................. 100<br />
Table 6-6: Achieved Improvements ............................................................................................. 102<br />
Index of Formulas<br />
Formula 4-1: Threshold Value to Determine Potential Outliers..................................................... 49<br />
Formula 4-2: Calculation of Noise.................................................................................................. 51<br />
Formula 4-3: Calculation of Curve Stability.................................................................................... 52<br />
Formula 4-4: Calculation of Prediction Stability ............................................................................. 52<br />
Formula 5-1: The Median Formula................................................................................................. 59<br />
Formula 5-2: The Arithmetic Mean Formula .................................................................................. 59<br />
Formula 5-3: The Standard Deviation Formula.............................................................................. 60<br />
Formula 5-4: Method of Least Squares ......................................................................................... 61<br />
Formula 5-5: Method of Least Squares for an Assumed Linear Trend ......................................... 62<br />
Formula 5-6: Regression Function for Describing Linear Trend.................................................... 62<br />
Formula 5-7: Coefficient of Determination ..................................................................................... 64<br />
Formula 5-8: The Additive Component Model ............................................................................... 66<br />
Formula 5-9: The Multiplicative Component Model ....................................................................... 66<br />
Formula 5-10: The Definition of Availability [Masing88]................................................................. 68<br />
Formula 5-11: The Markov Property .............................................................................................. 68<br />
Formula 5-12: Transition Probability Matrix ................................................................................... 69<br />
Formula 5-13: Conditions for the Transition Probability Matrix...................................................... 69<br />
Formula 5-14: Sample Transition Probability Matrix...................................................................... 70<br />
Formula 5-15: Transition Probabilities of Several Changes in a Row........................................... 70<br />
Formula 5-16: Formula of Chapman-Kolmogorov ......................................................................... 70<br />
Formula 5-17: Formula of Chapman-Kolmogorov (Simplified Version)......................................... 71<br />
Formula 5-18: Identity Matrix as an Example of a Non-Converging Markov Chain................... 71<br />
Formula 5-19: Definition of Neurons .............................................................................................. 75<br />
Formula 5-20: Definition of V and F ............................................................................................... 76<br />
Formula 5-21: Determination of Error ............................................................................................ 77<br />
Formula 5-22: The Delta Rule........................................................................................................ 78<br />
Formula 5-23: Hebb Learning Rule................................................................................................ 78<br />
Formula 6-1: Regression Function and Coefficient of Determination............................................ 98<br />
1 Introduction<br />
1.1 Initial Position and Problem<br />
As more and more technical devices perform mission-critical tasks within industry and<br />
medical research, monitoring such devices is becoming increasingly important.<br />
Possible malfunctions can damage these expensive products or their contents, which<br />
can lead to very high costs (both direct and indirect costs, so-called collateral<br />
damage). That is why electronic sensor based monitoring systems have become<br />
popular in recent years.<br />
One of these systems is XiltriX, a hardware and software combination from the company<br />
<strong>Contell</strong>/IKS, which is currently applied to laboratory equipment. Its basic<br />
functionality is to monitor and record the temperature of fridges and/or the CO₂<br />
concentration within incubators in order to prevent damage to goods stored in these devices.<br />
Elementary functionality as well as some useful tools are already implemented. The<br />
customer or <strong>Contell</strong>/IKS defines critical minimum and maximum temperature limits,<br />
and as soon as a limit is exceeded, the system issues a warning by e-mail or SMS.<br />
The main question now is in which direction the development of the software should<br />
continue. The idea of this study is to extend the existing “reactive” XiltriX into a more<br />
“pro-active” system that recognizes trends and notifies a person in charge before the<br />
critical minimum and maximum temperature limits are exceeded. In addition,<br />
XiltriX should offer decision support that allows a person in charge to better<br />
assess the system’s condition in situations with exceptional temperature levels.<br />
After comparing XiltriX to other major monitoring products, this diploma thesis will<br />
work out some promising ideas to show the possibilities for further development. The<br />
main focus is the monitoring data currently obtained from the field by<br />
XiltriX. At the moment this data is only accessible numerically or in the form of a graph.<br />
Analyzing the graphs manually in retrospect has already helped to predict malfunctions,<br />
but the results rely on the experience and especially the instinct of <strong>Contell</strong>/IKS staff. At<br />
present it is not clear how reliable this intuitive data analysis really is. It is also<br />
problematic that this kind of data analysis costs even an experienced person a lot of<br />
time, because the graphs from every single sensor have to be inspected manually.<br />
1.2 Goals of this Study and Approach<br />
The main task of this research is to determine whether, and in which way, it is<br />
possible to reliably predict malfunctions and to give decision support to the customer.<br />
To this end, statistical and data mining methods shall be applied to currently available<br />
datasets.<br />
Hence, existing customer data has to be collected and analyzed. Furthermore, the<br />
methods mentioned above have to be evaluated for their suitability to offer additional<br />
decision support and to reliably predict malfunctions. This study will stick to the data<br />
currently monitored by sensors in the field and will add no new measurement data.<br />
Moreover, it is necessary to point out the added value the proposed solutions offer<br />
the customer.<br />
The first step of this research is to define what monitoring is about, to explain its general<br />
importance and current practice, to identify existing problems, and to derive a requirements<br />
analysis for a monitoring system within the setting of sensor based temperature monitoring of<br />
fridges. This is done in chapter 2. Afterwards, a review of XiltriX and other major<br />
monitoring products is given in chapter 3 to point out their level of compliance with the<br />
requirements worked out before. The succeeding chapter 4 reviews the current state of<br />
research.<br />
Based on these results in combination with literature research, chapter 5 introduces<br />
suggestions as to how the system could be improved. The aimed-for results are:<br />
1. To gain additional knowledge of the cooling device’s condition from recorded<br />
datasets to offer additional decision support in case of an exceptional<br />
temperature level<br />
2. To offer software that reliably predicts upcoming malfunctions<br />
Offering the customer additional knowledge regarding the equipment’s condition<br />
leads to the question of what important information can be retrieved from<br />
currently recorded datasets. Therefore, statistical and data mining methods are<br />
evaluated for their ability to offer additional information. This evaluation covers<br />
questions like:<br />
• Which statistical methods can be applied?<br />
• What knowledge gain do they offer?<br />
• What are the benefits for the operational staff?<br />
A determination of whether the possible knowledge gain is also sufficient to reliably<br />
predict upcoming malfunctions, followed by a case study, will conclude this study.<br />
2 Sensor Based Temperature Monitoring<br />
This diploma thesis focuses on sensor based temperature monitoring of freezers and<br />
fridges within medical laboratories. Due to the way a fridge functions and the lack of<br />
data on a large number of possible external influences, this setting faces<br />
particular problems. The Dutch company <strong>Contell</strong>/IKS supported this thesis by<br />
providing extensive information about their sensor based monitoring system XiltriX.<br />
Moreover, <strong>Contell</strong>/IKS made interviews with several employees of the UMC St.<br />
Radboud (university hospital of Nijmegen, the Netherlands) possible. This customer<br />
also provided stored historical data, which enables a validation of promising<br />
analyzing methods.<br />
Based on the interviews’ results, this chapter will highlight the importance of sensor<br />
based temperature monitoring of cooling devices within medical laboratories.<br />
Furthermore, typical behavior of cooling devices as well as currently applied<br />
monitoring methods are introduced. The identification of possible problems and a<br />
requirements analysis for an ideally working monitoring system conclude this chapter.<br />
2.1 Importance of Temperature Monitoring within Medical<br />
Laboratories<br />
As already pointed out in the previous chapter, sensor based temperature monitoring<br />
is becoming increasingly important within many different settings. Its task is to reliably<br />
determine the condition of the monitored devices. In general, a monitored device should<br />
meet the following criteria to be classified as OK [Weerdesteyn06]:<br />
1. Current state is within predefined specifications<br />
2. General behavior did not change significantly on the short-run<br />
3. General behavior did not change significantly on the long-run<br />
4. Presumably the behavior will not change significantly in the future<br />
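As an illustrative sketch only, the four criteria could be evaluated on a series of readings roughly like this. The window size and tolerances (`window`, `max_shift`, `max_trend`) are invented assumptions for demonstration and are not taken from XiltriX or [Weerdesteyn06]:

```python
from statistics import mean

def classify_ok(readings, lo, hi, window=12, max_shift=0.5, max_trend=0.05):
    """Check the four 'OK' criteria on chronological temperature readings.
    The last `window` values count as 'recent'; needs >= 2*window values."""
    recent = readings[-window:]
    previous = readings[-2 * window:-window]          # the window just before
    long_run = readings[:-window]                     # everything older
    trend = (recent[-1] - recent[0]) / (window - 1)   # crude per-reading slope
    return {
        "1_in_spec":      lo <= readings[-1] <= hi,   # within specifications
        "2_short_stable": abs(mean(recent) - mean(previous)) <= max_shift,
        "3_long_stable":  abs(mean(recent) - mean(long_run)) <= max_shift,
        "4_no_trend":     abs(trend) <= max_trend,    # no drift expected ahead
    }

steady = [5.0, 5.5] * 24                        # a healthy oscillating fridge
warming = [5.0 + 0.05 * i for i in range(48)]   # slowly drifting upward
```

For the `steady` series all four checks pass; the `warming` series fails the in-spec and short-run checks. Criterion 4 in particular would need a far more careful formulation in practice, which is exactly what the later chapters investigate.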
Such a classification is very important because many medical goods have to be<br />
kept cool. Blood samples, for example, need a constant temperature of about 6°C.<br />
Deviating temperatures over a longer period are dangerous to these blood samples.<br />
Even more critical are cryogenic freezers. Their samples are stored at -80°C or even<br />
colder. A freezer’s malfunction can destroy these samples within a very short time.<br />
This has to be avoided because most of them are part of research work and<br />
irreplaceable. The contents of a freezer normally range in age from a few days to more<br />
than thirty years. That is why the breakdown of a freezer can lead to a loss of more<br />
than half a million euros. As a result, a possible breakdown has to be recognized as<br />
soon as possible, so that the contents can be saved by moving them to other devices. [Nijmegen06]<br />
It is very important to know that events like this cannot be insured because of the high<br />
risk. Therefore, many medical laboratories and especially hospitals are very<br />
interested in an intelligent monitoring solution that is able to recognize upcoming<br />
failures. [Weerdesteyn06]<br />
2.2 Functioning and Behavior of Freezers and Fridges<br />
In order to develop new or improve existing sensor based monitoring approaches,<br />
this section introduces the necessary knowledge about the functioning and behavior<br />
of cooling devices.<br />
2.2.1 General Functioning of a Fridge<br />
Although different kinds of cooling devices based on different technologies exist, they<br />
all rely on the same idea: the cooling cycle. Figure 2-1 illustrates the cooling cycle<br />
of a regular household refrigerator. The basic idea of this cycle is to transport heat<br />
energy from the inside of the fridge to the outside.<br />
Figure 2-1: Cooling Cycle of a Household Refrigerator (adapted) [UniMunich06]<br />
The exemplary cycle in Figure 2-1 uses a compressor, and a refrigerant circulates<br />
within the cycle. The refrigerant reaches the compressor (4) vaporized. The<br />
compressor compresses the gas into the condenser coil (1). Because of the resulting<br />
high pressure, the vaporized refrigerant becomes liquid and emits heat. After cooling<br />
down, the refrigerant passes the expansion valve (2). The second half of the cycle is<br />
called the evaporator coil (3). Within this low-pressure part the liquid refrigerant starts<br />
to vaporize again. The energy this requires is taken from the air inside the fridge, so<br />
that the inside cools down. The vaporized refrigerant then reaches the compressor<br />
and the cycle starts again. [UniMunich06]<br />
Fridges with a cooling cycle like that are called active fridges. Within laboratories and<br />
industry a second class of fridges exists. These devices do not have a compressor of<br />
their own; they are supplied with cold air by a centralized unit. Devices like that are<br />
called passive fridges.<br />
2.2.2 Technical Behavior of Fridges (Without External Influences)<br />
Due to the functioning just described, active cooling devices as well as passive ones<br />
do not keep a constant temperature. Instead, they warm up a bit, then cool down<br />
until they are cold enough to turn off again. Depending on the kind of cooling device,<br />
the temperature sequence looks different. This technical behavior will be<br />
exemplified with some temperature sequences of different kinds of cooling devices to<br />
give an impression of possible behavior.<br />
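The warm-up/cool-down cycling can be illustrated with a tiny thermostat simulation. This is a deliberately idealized sketch: the switching limits and the linear per-step warming and cooling rates are invented for illustration and do not model any real device from this thesis.

```python
def simulate_fridge(steps, t0=5.0, on_at=6.0, off_at=4.0,
                    warm_rate=0.25, cool_rate=0.5):
    """Generate an idealized sawtooth temperature trace: the device warms
    until it reaches `on_at`, then the compressor cools it down to `off_at`."""
    temps, cooling = [t0], False
    for _ in range(steps - 1):
        t = temps[-1]
        if not cooling and t >= on_at:
            cooling = True          # warm enough: compressor switches on
        elif cooling and t <= off_at:
            cooling = False         # cold enough: compressor switches off
        temps.append(t - cool_rate if cooling else t + warm_rate)
    return temps

trace = simulate_fridge(100)
# The trace oscillates between the two switching limits (4°C and 6°C),
# producing the sawtooth shape seen in the recorded sequences below.
```

Real traces, as the following figures show, are far less regular: cycle lengths drift, peaks overshoot, and identical devices behave differently.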
These examples were taken from the <strong>Contell</strong>/IKS XiltriX demo system, which was<br />
set up for testing and presentation purposes. This system monitors some demo<br />
fridges 24/7. As these fridges are normally empty and the doors are kept closed, the<br />
collected data offers an overview of typical behavior without external influences.<br />
Figure 2-2 pictures the temperature sequence of a properly working 6°C passive fridge<br />
over about eighteen hours. Most of the time, the temperature oscillates between 4°C and<br />
6°C. Moreover, nearly every cooling cycle takes about twenty minutes.<br />
Figure 2-2: Temperature Sequence of a Properly Working 6°C Passive Fridge [DEMO06]<br />
Figure 2-2 contains three eye-catching cycles. The first two occur between 16:00 and
18:00. In one cycle the fridge cools down although the upper limit of 6°C is not
reached; three cycles later the fridge heats up to more than 7°C. The last suspicious
cycle occurs around 2 o’clock in the morning, when the fridge reaches a temperature of 6.8°C
before it starts to cool down again.
As the following graphs from other machines will show, behavior like this has to be
classified as normal. Every fridge occasionally behaves “suspiciously” without actually
malfunctioning. Moreover, machines of the same type behave differently; even fridges
identical in construction can show different behavior for unknown reasons. 1
Figure 2-3: Temperature Sequence of a Properly Working 6°C Active Fridge [DEMO06]<br />
Figure 2-3 shows a temperature sequence of an active fridge. Like the previous
passive one, it should keep a temperature of 6°C. Comparing the two, the
active fridge never exceeded 6°C within the two days shown, whereas the passive
fridge exceeded 6°C about every 20 minutes. Another difference can be found in the
shape of the graph. Figure 2-2 shows a more regular shape with very short cooling
cycles, which is typical for a passive fridge. Figure 2-3 does not contain such a regular
pattern; its cooling cycles are similar but vary in shape. Also, the passive fridge’s
cooling cycle is less than half as long as that of the active device, which takes
about 43 minutes.
Most results of this comparison cannot be generalized because counterexamples do
exist [DEMO06]. The only indication of a passive fridge is the regular pattern with
very short cooling cycles and a larger deviation. All other differences could be
reversed when comparing two other 6°C fridges. 2
2.2.3 Technical Behavior of Freezers (Without External Influences)<br />
Looking at freezers complicates the situation even further. Figure 2-4 pictures the
temperature sequence of a -80°C active freezer. Although it operates slightly above
1 Reasons are unknown because of the lack of information problem. (See section 2.4.1 for details)
2 See [DEMO06], [UMC06] for further details
the specified value, it works very accurately, because the total deviation is less than 2°C
within the displayed time of five days. On the other hand, the graph contains a trend:
within four days, the daily mean increased by more than half a degree. An event like this
has to be recognized and surveyed when classifying the system’s behavior.
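Such a slow drift can be detected, for instance, by comparing daily mean temperatures. The following sketch uses made-up data and a hypothetical 0.5°C criterion; it is an illustration of the idea, not a method from the thesis:

```python
# Sketch: detect a slow upward trend by comparing daily mean temperatures,
# as suggested for the freezer in Figure 2-4 (threshold is hypothetical).
def daily_means(temps, samples_per_day):
    days = [temps[i:i + samples_per_day]
            for i in range(0, len(temps), samples_per_day)]
    return [sum(day) / len(day) for day in days]

def has_upward_trend(temps, samples_per_day, threshold=0.5):
    """Flag the series if the daily mean rose by more than `threshold`
    degrees between the first and the last day."""
    means = daily_means(temps, samples_per_day)
    return len(means) >= 2 and means[-1] - means[0] > threshold

# Five days of a -80 °C freezer drifting upwards by 0.15 °C per day:
data = [-80.0 + 0.15 * d for d in range(5) for _ in range(96)]
print(has_upward_trend(data, samples_per_day=96))  # True
```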
Figure 2-4: Temperature Sequence of a -80°C Active Freezer [DEMO06]<br />
Figure 2-5 shows the behavior of a -80°C passive freezer. This one behaves totally
differently. The data does not contain a trend but oscillates much more than the one
above. Furthermore, -80°C is never reached and the total deviation is more than 8°C, so
that the temperature regularly exceeds -70°C.
Figure 2-6 shows another kind of freezer. The red lines signal door openings. The
special thing about this device is that, for technical reasons, it needs a regeneration
cycle every few hours. Compared to the previous datasets, the oscillation is much
higher and the shape of the graph is more irregular. But as this is normal behavior for
this kind of freezer, it should be classified as OK.
2.2.4 Behavior in Practice<br />
As mentioned at the beginning of this section, all temperature sequences shown so
far originate from the <strong>Contell</strong>/IKS demo system. Since these monitored
cooling devices are empty and not in use, they are not externally influenced by users.
In practice, a cooling device can be influenced by a large number of variables. 3
3 See section 2.4.2 for details<br />
Figure 2-5: Temperature Sequence of a -80°C Passive Freezer [DEMO06]<br />
Figure 2-6: Temperature Sequence of a -20°C Active Freezer [DEMO06]<br />
Hence, the temperature sequence of a monitored cooling device in use changes to a
more irregular pattern. Types and origins of external influences will be
identified in section 2.4.2. Until then, the following example should give a
general idea of temperature sequences in practice.
Figure 2-7 shows the behavior of a properly working cryogenic -180°C freezer in
practice. The data originates from the UMC St. Radboud (the university hospital of
Nijmegen, the Netherlands) and represents typical behavior for this kind of device. 4
In contrast to the previous examples, this figure pictures a larger time slice. These ten
months were chosen to give an impression of practical behavior in the long run.
A baseline at about -183°C is recognizable. Due to the different scaling, the figure
no longer pictures the single cooling cycles, although they do exist. Instead,
over a hundred irregular peaks are pictured that cannot be traced back to
typical technical behavior. In fact, not a single peak is caused by a technical
malfunction [Nijmegen06]. Hence, monitoring systems have to be able to figure out
whether such a peak is caused by a technical malfunction or by external influences.
4 The university hospital of Nijmegen provided their datasets only as a copy from their XiltriX system.
That is why Matlab was used to draw this graph. Apart from the slightly different appearance, the data
would look the same in XiltriX.
Figure 2-7: Temperature Sequence of a Cryogenic Freezer in Practical Use [UMC06]<br />
2.2.5 Behavior in Case of a Malfunction<br />
Unfortunately, the 36 datasets provided by the UMC St. Radboud do not contain a
single technical malfunction [UMC06], [Nijmegen06]. In fact, most cooling devices
operate for years without a single failure, which means the probability of a
technical malfunction is very low. Hence, it is not possible to introduce a sample pattern
here. Nevertheless, the following criteria have been identified as hinting at a malfunction
in the absence of external influences [Weerdesteyn06]:
1. Form and shape of the temperature sequence change significantly in the
short run
2. Form and shape of the temperature sequence change significantly in the
long run
3. Temperature exceeds the range of normal operation
A temperature exceedance without external influence is a definite indication of a
cooling device’s malfunction. Most technical failures, however, are caused by compressor
breakdowns. Usually, such a breakdown does not occur suddenly but predictably,
because the form and shape of the corresponding temperature sequence start to
diverge before the compressor actually breaks down. Early recognition could
allow predictive maintenance [Weerdesteyn06].
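Criterion 1, a short-run change in form and shape, could for instance be approximated by comparing the spread of the most recent measurement window with a preceding baseline window. This is only a sketch; the window size and the factor are hypothetical thresholds, not values established in this thesis:

```python
import statistics

# Sketch: flag a short-run shape change by comparing the spread of the
# newest window against the immediately preceding baseline window.
def shape_changed(temps, window=96, factor=2.0):
    """True if the newest window's standard deviation differs from the
    baseline window's by more than `factor` in either direction."""
    if len(temps) < 2 * window:
        return False                     # not enough history yet
    baseline = statistics.stdev(temps[-2 * window:-window])
    recent = statistics.stdev(temps[-window:])
    return recent > factor * baseline or recent < baseline / factor

regular = [5.0 + (i % 20) * 0.1 for i in range(96)]   # regular cycles
erratic = [5.0 + (i % 20) * 0.4 for i in range(96)]   # widened swings
print(shape_changed(regular + erratic))  # True
```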
2.3 Current Practice of Sensor Based Temperature Monitoring<br />
The basic idea of currently applied sensor based temperature monitoring is to attach
a sensor to a cooling device. The collected information is used to evaluate the
condition of the monitored fridge. The assumption behind this idea is that a cooling
device is malfunctioning, or at least has to be inspected, when the regular temperature
range is exceeded.
Based on this assumption, the current main approach of temperature monitoring is to
define critical minimum and/or maximum temperature limits that may not be
exceeded. This idea leads to three different kinds of temperature monitoring in
current practice:
1. Temperature verification in retrospect<br />
2. Online comparison of current temperature values to a specified range<br />
3. Online comparison and data analysis in retrospect<br />
In general, temperature verification in retrospect is based on a single indication
sensor that operates as an isolated application. The task of such a sensor is simply
to indicate whether a temperature exceedance occurred during the monitoring time.
More advanced sensors can also indicate the duration of the exceedance or the
most critical temperature value. This approach offers very little information and is
not designed to avoid critical temperatures but to report them in retrospect. 5 Hence,
it is often used for the transportation of frozen goods, but it is not
suitable for monitoring important samples that must not defrost under any circumstances.
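The information such an indication sensor delivers can be sketched as a small retrospective scan over a stored series. Function name, sampling interval and the example values below are hypothetical:

```python
# Sketch of kind 1, verification in retrospect: report whether the range was
# exceeded, for roughly how long, and the most critical value.
def verify_in_retrospect(temps, t_min, t_max, interval_minutes=15):
    """Summarize a stored series the way an indication sensor would."""
    outside = [t for t in temps if t < t_min or t > t_max]
    if not outside:
        return {"exceeded": False}
    # most critical value = furthest outside the allowed range
    worst = max(outside, key=lambda t: max(t - t_max, t_min - t))
    return {"exceeded": True,
            "duration_minutes": len(outside) * interval_minutes,
            "most_critical": worst}

log = [5.2, 5.8, 6.4, 7.1, 6.6, 5.9, 5.1]
print(verify_in_retrospect(log, t_min=2.0, t_max=6.0))
```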
The second kind of temperature monitoring is often found in practice. The basic idea
is simply to compare the current measurement values to the predefined temperature
range at short time intervals. In case of a temperature exceedance, an alarm is
raised immediately to notify a person in charge. In contrast to the first kind
of temperature monitoring introduced above, this one can operate as an isolated application as well as
5 See section 3.5.1 for a sample product<br />
a centralized one. An isolated application is characterized by using its own features
to raise an alarm, such as built-in flashlights or sirens. A centralized application (e.g.
XiltriX) transfers information about critical situations to a centralized unit that displays the
current status of all monitored devices in one place.
The third kind of temperature monitoring is an extension of the one just presented.
Besides comparing current temperature values to predefined intervals, the temperature
sequences of the single devices are stored. Again, this kind of temperature
monitoring can be implemented as an isolated application as well as a centralized
one. The stored historical temperature sequences enable retrospective data analysis
to detect changes in behavior over time.
So far, this data analysis has been kept very simple. Apart from basic visualization
facilities for evaluating the behavior manually and some simple statistical measures,
current temperature monitoring products on the market do not offer more complex
analysis methods. 6
Hence, the main task of this diploma thesis is to find additional analysis methods that
offer more precise status information about monitored cooling devices. To that end,
the next section will identify the problems and potential sources of error that current
sensor based temperature monitoring faces.
2.4 Problems and Potential Sources of Error<br />
Data analysis (e.g. statistics) can lead to two different kinds of error
([Scharnbacher04], p. 85):
1. Error of first kind<br />
2. Error of second kind<br />
Based on the null hypothesis (H0 = the cooling device is OK), four different cases are
possible, as pictured in Table 2-1. Within the setting of temperature
monitoring of cooling devices, the goal is to always reach the right decision. As
stated in section 2.1, the monitoring task in this setting is mission critical.
Hence, an error of second kind has to be avoided in any case. In contrast, an error of
first kind is only a false alarm, which is indeed disturbing but not dangerous.
6 See chapter 3 for details<br />
11
Table 2-1: Error of First and Second Kind

                 H0 is correct          H0 is wrong
Accept H0        Right decision         Error of second kind
Reject H0        Error of first kind    Right decision
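Table 2-1 can be restated as a simple lookup that maps the decision about H0 and the true device state to the four outcomes:

```python
# Sketch of Table 2-1: decision about H0 versus the true state of the device.
def classify(accept_h0, device_ok):
    if accept_h0 and device_ok:
        return "right decision"
    if accept_h0 and not device_ok:
        return "error of second kind"   # missed malfunction: must be avoided
    if not accept_h0 and device_ok:
        return "error of first kind"    # false alarm: disturbing, not dangerous
    return "right decision"

print(classify(accept_h0=True, device_ok=False))  # error of second kind
```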
The following subsection will introduce the major problem that sensor based
temperature monitoring faces and its consequences for errors of first and second kind.
2.4.1 The Lack of Information Problem<br />
Currently, the major problem in sensor based temperature monitoring of cooling
devices is a lack of information. All well-known systems on the market attach only a
single temperature sensor to a fridge, so in most cases only the current temperature
of a cooling device is available for analysis. Advanced systems such as XiltriX,
introduced below, offer the possibility of adding an additional door sensor, providing
at least a second piece of information.
In fact, there are many factors that influence the temperature inside a
fridge. Figure 2-8 specifies some of these factors and illustrates the problem of current
systems. Of course, it would be possible to add further sensors to every
monitored device, but their number is always kept small to minimize expenses
[Nijmegen06]. For example, every temperature sensor for XiltriX causes additional
costs of about €500 [Weerdesteyn06]. This leads to the use of a single sensor, sometimes
in combination with a door opening sensor.
This lack of information turns the cooling device into a black box and prevents
finding the real causes of temperature deviations. In particular, the needed
information of whether a fridge was significantly influenced externally within a certain time
cannot be obtained with certainty. 7 This leads to potential sources of error when
analyzing temperature sequences. As an error of second kind has to be avoided in
any case, the number of errors of first kind increases in situations with unknown
influences.
7 See section 2.4.2 for details<br />
Figure 2-8: Lack of Information Problem Caused by Sensor Based Temperature Monitoring<br />
A second problem increases the lack of information even further: the
unknown behavior between two single measuring points. Figure 2-9 exemplifies a
rise and fall of temperature between two of these points. Analyzing this data in
retrospect would miss this actual behavior; both a graphical and a
numerical analysis would assume a constant temperature within this interval, as
indicated by the red dashed line.
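The effect can be illustrated with a made-up minute-by-minute series: a ten-minute excursion between two 15-minute sampling points simply disappears from the stored data.

```python
# Sketch: a short excursion between two 15-minute sampling points is
# invisible in the stored data (all values are made up for illustration).
def sample(continuous, step):
    """Keep only every `step`-th value, like a fixed logging interval."""
    return continuous[::step]

# One value per minute; a 10-minute excursion to 9 °C starting at minute 5:
per_minute = [5.0] * 5 + [9.0] * 10 + [5.0] * 45
stored = sample(per_minute, step=15)
print(max(per_minute), max(stored))  # 9.0 5.0
```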
Figure 2-9: The Problem of Unknown Behavior between Two Single Data Points<br />
2.4.2 Potential Sources of Error<br />
A change in cooling behavior could indeed be caused by a technical malfunction. But
as the probability of such a malfunction is very small 8 , a change is normally caused
by other external influences. Due to the lack of information problem, the reason for
abnormal behavior cannot always be determined. This subsection identifies common
influences that can lead to false alarms. They can be divided into two groups:
1. Environmental influences<br />
2. User interaction<br />
Environmental influences are rather rare. In principle, any environmental
change could influence the behavior of a cooling device. But in practice, only two
common factors have been identified that really change the temperature sequence while
the technical condition remains the same [Weerdesteyn06]:
1. A significant change in room ambient temperature<br />
2. A power failure<br />
A change in room ambient temperature generally changes the warming-up and
cooling-down behavior of freezers and fridges, so that the changed temperature
sequence of the corresponding cooling device could lead to a rejection of H0. This
decision has to be classified as an error of first kind. In contrast, a raised alarm
caused by a power failure should be classified as a right decision, because a situation
like this would endanger the stored samples, although the technical condition of the
cooling device is still OK.
Since these environmental influences are very infrequent, the main focus has to be
on changes in behavior caused by user interaction. In general, such interaction is
not measured; only some monitoring products attach an additional door sensor to
monitored devices to recognize at least door openings. In fact, door openings
influence the cooling behavior significantly, because warm air enters the fridge.
Freezers in particular heat up very fast, so that an open door leads to an alarm within
a very short time. [Nijmegen06]
Aside from door openings, the condition of newly inserted samples as well as the
filling level of a cooling device are significant influencing factors. The insertion of warm
samples leads to prolonged heating, even after the door has been closed again.
Moreover, the fridge’s filling level can change the cooling-down time, so that the form and
shape of the corresponding temperature sequence change although the technical
condition remains the same.
8 See section 2.2.5 for details
Besides these general sources of error, current practice faces
additional problems that originate from the currently applied method
introduced in section 2.3.
2.4.3 Methodological Problems<br />
The graphs presented in section 2.2 already exemplified many different
behaviors of fridges and freezers. These examples were chosen to show how difficult
it is to accurately classify different kinds of behavior as normal operation or
malfunction.
The currently applied method of predefining critical temperature limits only allows a
classification based on the current temperature value. 9 As soon as
the temperature rises above the predefined maximum or falls below the predefined
minimum, the cooling device is classified as malfunctioning. This method can
indicate a bad technical condition of a fridge, but due to the lack of information and
other possible error sources, it is impossible to prove a malfunction using this
method.
Since an error of second kind has to be avoided in any case, H0 has to be rejected
every time the temperature limits are exceeded. Because of the very low probability
of a real technical malfunction, this leads to a very high number of errors of first
kind. 10
Besides this high number of false alarms, another methodological problem exists.
As mentioned in section 2.2.5, most malfunctions develop gradually, so that they could be
recognized before the temperature limits are exceeded. Such a change in form and shape of a
temperature sequence is not recognized by the current method. Hence, situations
like this lead to an error of second kind, because H0 is accepted although the system
is starting to malfunction.
Also, the required recognition of changes in behavior in the long run is only possible
to some extent with the existing method. Typically, a change is bound to significantly
higher or lower temperatures. In that case, the temperature regularly exceeds one of the
predefined limits and H0 is rejected.
9 See section 2.3 for details
10 See section 2.2.5 for details
Problematic are slight changes within the defined temperature range. A small
increase of the mean temperature, for instance, typically also increases the peak values
and leads to a temperature exceedance. However, a small increase of the mean
temperature with unchanged peaks will not be recognized. This again causes an error
of second kind, because H0 is accepted although the monitored device may
already be malfunctioning.
Besides all these problems, defining appropriate critical temperature limits is the
greatest methodological problem. On the one hand, predefined limits just outside the
typical temperature range would decrease errors of second kind, because even
slight changes in temperature lead to a rejection of H0. On the other hand, nearly
every external influence then also leads to a rejection of H0, which has to be classified as
an error of first kind in nearly all cases.
In practice, critical temperature limits are normally defined with a wider span to
reduce the number of false alarms caused by external influences. As mentioned
before, this increases the probability of an error of second kind. 11
To improve this unacceptable situation, the next section will carry out a
requirements analysis as a basis for finding new methods.
2.5 Aimed Goal and Requirements Analysis<br />
As mentioned in section 1.2, the goal is to improve the current situation by
offering decision support. This can be done by providing the person in charge with more
information than just the current temperature and the status of an optionally
installed door opening sensor. This additional information
should enable the responsible person to classify the current behavior of a cooling
device more precisely.
Since the attachment of additional sensors is not considered in this diploma
thesis 12 , the only way to gain additional information about a cooling device is the analysis
of stored historical temperature sequences. Many more advanced systems already
11 Figure 2-7 on page 9 pictures this problem quite well. The red dashed line marks the predefined
maximum critical temperature. As long as this temperature is not exceeded, the null hypothesis H0 is
accepted, even if the cooling device is already malfunctioning.
12 See section 1.2 for details
store this kind of data but only offer basic visualization facilities and sometimes
basic statistical summaries. Moreover, such systems currently only allow data
analysis by hand.
This leads to a large amount of stored historical data that currently sees very little use. The
main idea is now to test statistical and data mining methods for their applicability to
improve the current situation of scarce information. In particular, reliable answers to the
criteria given at the beginning of chapter 2 would offer great decision support, as
pictured in Figure 2-10.
Figure 2-10: Estimated Answers of Statistics and Data Mining<br />
One hundred percent reliable answers to the questions on the right would allow a
perfect classification of cooling devices as OK or malfunctioning. But even if the
answers can only be given with lower reliability, the knowledge gained could
at least support the assessment of the current technical condition and put it on a broader
basis than just the current temperature.
Besides these four criteria, section 2.1 identified two more important requirements.
Since the stored samples are normally of high value and easy to destroy, a monitoring
approach has to identify failures as soon as they are recognizable,
because early detection provides additional time to move stored samples to other
fridges. Moreover, it must be possible to avoid an error of second kind in any case.
According to section 2.2.5, another requirement is the ability to recognize external
influences, because only changes that cannot be traced back to these influences
have to be classified as a malfunction. The following list summarizes the
requirements analysis:
• The monitoring approach is able to classify the current state of a monitored<br />
device<br />
• The monitoring approach is able to recognize significant changes of general
behavior in the short run
• The monitoring approach is able to recognize significant changes of general
behavior in the long run
• The monitoring approach is able to predict upcoming failures<br />
• The monitoring approach is able to identify failures as soon as they are<br />
recognizable<br />
• The monitoring approach is able to avoid an error of second kind in any case<br />
• The monitoring approach is able to recognize external influences<br />
Based on these requirements, chapter 5 will introduce promising statistical and data
mining methods, which will subsequently be tested for feasibility. Before that,
the next chapter will introduce XiltriX and other major sensor based monitoring
products and review them against the requirements just worked out.
3 Current Monitoring Systems<br />
The last chapter pointed out the existing problems of the temperature monitoring task
and the limitations of the current approach of simply setting critical temperature limits. This
chapter will introduce currently available monitoring systems and their existing
problems. The main focus is on XiltriX, but section 3.5 will review other products
as well and point out differences.
XiltriX is a monitoring system developed by the Dutch company <strong>Contell</strong>/IKS. It
consists of a combination of hardware and software that realizes the basic monitoring
tasks in the setting of medical laboratories. The basic idea is to attach sensors
to cooling devices and to collect the measurement data on a centralized web server.
If a predefined temperature limit is exceeded, the system can notify
a person in charge locally and remotely, using flashlights or SMS for instance.
Development of this system started in 1991, when the company
IKS 13 published its first monitoring system. It was named JS and was built in
cooperation with several Dutch blood banks and aqua labs. Over the years, the
system was improved by implementing user suggestions. After releasing JS 8,
JS 16, JS 32, JS 64 and JS 2000, IKS decided to rebuild the system completely,
using modern hardware and software and the knowledge gathered from the
JS development. This rebuild was published in 2003 as JS 2003. Apart from the change
of name to XiltriX and some minor improvements, this version is still the current state.
[Weerdesteyn06]
3.1 XiltriX’s Technical Basis<br />
Figure 3-1 pictures a flowchart that introduces the general approach of most sensor
based temperature monitoring systems, including XiltriX. The first step of monitoring is
the collection of available data. Afterwards, this data is stored in a database for
documentation purposes. As described in section 2.3, the current state of a monitored
cooling device is identified only by comparing the current temperature to the
predefined critical temperature limits. As long as the measured temperature is within
these limits, the monitored device is classified as OK; otherwise, it is
classified as malfunctioning and a person in charge is notified.
As monitoring is a continuous task, this procedure is repeated every time a
predefined time interval has elapsed, as indicated by the black dashed line.
13 The companies <strong>Contell</strong> and IKS merged in January 2006<br />
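The loop of Figure 3-1 can be sketched as follows; `read_sensor`, `store` and `notify` are hypothetical callbacks standing in for the sensor hardware, the database and the alarm channel, not XiltriX APIs:

```python
import time

# Sketch of the monitoring loop: collect, store, compare to the predefined
# limits, notify on exceedance, then wait for the next interval.
def monitor(read_sensor, store, notify, t_min, t_max, interval_s, cycles):
    for _ in range(cycles):
        temp = read_sensor()          # 1. collect the available data
        store(temp)                   # 2. document it in the database
        if not (t_min <= temp <= t_max):
            notify(temp)              # 3. alarm a person in charge
        time.sleep(interval_s)        # 4. repeat after the time interval

readings = iter([5.1, 5.6, 6.3, 5.8])
log, alarms = [], []
monitor(lambda: next(readings), log.append, alarms.append,
        t_min=2.0, t_max=6.0, interval_s=0, cycles=4)
print(log, alarms)  # [5.1, 5.6, 6.3, 5.8] [6.3]
```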
Figure 3-1: Flowchart of the Temperature Monitoring Task<br />
Section 2.4.1 introduced the lack of information problem that causes many false
alarms. Figure 3-1 illustrates that even parts of the known data remain unused for
classification purposes, as indicated by the green and red arrows. Although all
available data is collected and stored, only the current temperature and the
predefined critical temperature limits are used by XiltriX to determine the cooling
device’s condition. In particular, the stored historical temperature data is not used; the
user can only analyze the collected data manually.
The next sections introduce the capabilities XiltriX currently offers. The
description is divided into three parts:
1. XiltriX’s components (this section)<br />
2. XiltriX’s basic functionality (section 3.2)<br />
3. XiltriX’s additional features (section 3.3)<br />
3.1.1 Basic Components of a XiltriX Installation<br />
A basic XiltriX installation consists of at least a web server, one or more power
supplies, one or more substations (called OS-4s) and several temperature sensors
(called PT100s). Figure 3-2 pictures a schematic drawing of the connections
between the single units of XiltriX.
Although all parts are mandatory for a working XiltriX system, the web server is the
most important one, because it contains the XiltriX software and stores the
measurement data. The software is provided as a Java applet, which means that no
local installation is necessary; every client just needs a web browser such as
Microsoft Internet Explorer 6 and a connection to the local area network. In case of a
web server breakdown, the whole XiltriX system stops working.
Also important are the OS-4s. They are installed near the devices to be
monitored. Each of these substations allows up to four
sensors and up to four digital devices such as switches, sirens or flashlights to be attached. 14
Furthermore, up to ten substations can be connected in a row.
The connection between web server and substations is made through the
system’s power supplies. Each power supply can energize five rows of
substations, with a maximum of ten devices per row. The same
cable is used to relay the measurement data from the connected OS-4s to the web
server, so no additional cable is needed.
14 See section 3.3.1 for further details<br />
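From these figures, the capacity of a single power supply follows directly (a back-of-the-envelope calculation, assuming every substation is fully equipped with four sensors):

```python
# Capacity per power supply, derived from the figures stated above.
rows_per_supply = 5          # rows of substations per power supply
substations_per_row = 10     # OS-4s per row
sensors_per_substation = 4   # sensors attachable to one OS-4

substations = rows_per_supply * substations_per_row
sensors = substations * sensors_per_substation
print(substations, sensors)  # 50 200
```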
Figure 3-2: XiltriX - Schematic Drawing of an Installation with Basic Components<br />
3.1.2 Other Installation Possibilities<br />
Typically, the devices to be monitored by XiltriX are spread all over a building,
so a hardware installation of XiltriX involves a lot of wiring. Therefore,
XiltriX offers two additional possibilities for connecting substations to the web server:
Figure 3-2 demonstrates that normally only the web server is connected to the
existing company network to publish the collected information. But this network can
also be used to transport the measurement data directly from the substation to the
web server. To do so, it is necessary to convert the substation’s signal into a TCP/IP
signal compatible with the local area network. This can be done with
a converter that is also available for XiltriX. Of course, a substation connected like
this needs its own power supply, because the network does not provide energy.
The second additional possibility is the use of a wireless LAN. Similar to the approach
just introduced, the substation is equipped with its own power supply; the only
difference is the way the data is sent to the web server. Instead of using a
converter for the local area network, an additional wireless LAN is installed. This
method saves a lot of wiring, but it is less reliable than a cable connection due to
radio interference within hospitals. [Weerdesteyn06]
3.2 XiltriX’s Basic Functionality<br />
The last section focused on the general idea and the technical basis of XiltriX. This
section introduces XiltriX’s basic functionality, i.e. features that are mandatory
for a sensor based monitoring system and not unique to XiltriX. Section 3.3
will introduce special features that were implemented to address limitations of
the current monitoring approach.
Figure 3-1 pictures the flow of information. A person in charge is notified when the
predefined temperature limits are exceeded. Besides that, information can be
obtained from two additional sources, as indicated by the dashed arrows:
1. A display that shows current data<br />
2. The database that contains the historical temperature data<br />
XiltriX offers both possibilities. Figure 3-3 pictures the main screen of XiltriX. It gives
an aggregated overview of the current data of all monitored devices and can be accessed
from every computer within the network. Most important is the white table in the middle
of the screen because it contains machine based data. Depending on the system's
configuration, this table shows current data from machines of one or more
departments. The first column represents the status of an optionally connected door
sensor; an empty field indicates that no such sensor is installed.
Furthermore, a unique identification number and a description are assigned to every
monitored device; they are displayed in the second and the fourth column. The third
column indicates the activation of the high resolution mode by showing an asterisk.
This mode forces the system to store a measuring point every single minute instead<br />
of every 15 minutes. 15<br />
Column number five shows the last measured value. Depending on the classified
current state of the attached device, this value can be up to 15 minutes old in
normal operation mode. To be able to classify such temperature values, critical limits
are set, as described in section 2.3. These limits can be seen in columns
seven and eight for every single device.
Figure 3-3: XiltriX - The Main Screen [DEMO06]<br />
In addition to these limits, a delay time can be defined for minimum and maximum
temperature alarms in columns six and nine. After a critical
temperature limit has been passed, the system waits for this predefined time before it alarms the
person in charge. The last two columns contain the date, the time and the most critical
temperature value of a current alarm. Entries within these two columns can only be
cleared by an alarm reset. 16
15 See section 3.2.1 for details<br />
16 See section 3.2.2 for details<br />
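The interplay of limits and delay time described above can be sketched in a few lines. This is a minimal illustration under assumed names and data layout, not XiltriX's actual implementation:

```python
# Hypothetical sketch of the alarm logic: an alarm only goes off once a
# reading has stayed outside the configured limits for at least the
# configured delay time.

def alarm_minute(readings, low, high, delay):
    """readings: list of (minute, temperature) tuples in time order.
    Returns the minute the alarm goes off, or None."""
    excursion_start = None
    for minute, temp in readings:
        if low <= temp <= high:
            excursion_start = None            # back within limits, no alarm
        elif excursion_start is None:
            excursion_start = minute          # excursion begins
        elif minute - excursion_start >= delay:
            return minute                     # delay time has passed: alarm
    return None
```

Short excursions that end before the delay time has passed, such as a brief door opening, thus never reach the person in charge.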
To indicate important events within this table XiltriX uses a color code to highlight<br />
exceptional temperature values and alarm messages. The colors and their meanings<br />
are listed in Table 3-1.<br />
Table 3-1: Listing of Existing Table Color Codes and Their Meaning
Orange: Temperature exceeded the set minimum or maximum limit value, or the door is open (within delay time)
Blue: Temperature exceeded the set minimum limit value (delay time has passed)
Red: Temperature exceeded the set maximum limit value (delay time has passed)
Yellow: An alarm has been canceled but not reset yet
Purple: An alarm has been reset but an activation delay is configured and active 17
Below this table there is another, smaller one, which offers information on
digital input devices that can, but do not have to, be bound to a single
machine. Figure 3-3 shows an installed start/stop switch for fridge number 4. 18 Pushing
the corresponding button stops the monitoring of this device, but at the same
time an alarm would go off because the delay time for this device is set to zero, as
can be seen in the DS column.
Of course, a scenario like that does not make sense in practice, but this data is taken
from the demo system, which is only made up for testing purposes and is meant to show
the technical possibilities of XiltriX. In reality, a button like this could be useful, for
instance, to disable alarms for cleaning purposes, provided that a delay time higher
than zero minutes is set. Other attachable switches and their functionality will
be presented in section 3.3.1.
Another important element of the main screen is the status bar above the just
described tables, because it offers an overview of the whole system. Monitored
devices can be grouped into up to 16 different departments. The color of the
corresponding department button indicates whether every machine within this group
is operating as expected. Table 3-2 gives an overview of the existing colors
and their meanings.
17 See section 3.2.2 for details<br />
18 See section 3.3.1 for details<br />
Table 3-2: Listing of Existing Status Bar Color Codes and Their Meaning
Grey: Button is not in use or not configured
Green: No alarms are activated
Red: An alarm has been activated within this section
Yellow: An alarm has been canceled but not reset yet
Blue: The SMS and/or E-Mail module is connected to XiltriX but turned off (only available for the SMS and E-Mail buttons)
The last six buttons offer additional information. "Tech" indicates a technical problem
within XiltriX. This could be a broken cable to one of the sensors or one of the
substations 19 , for instance. "M1" and "M2" symbolize master alarms 1 and 2.
Depending on the system's configuration, it is possible to assign special kinds of
serious failures to these buttons. 20 "Sys" reports system alarms. It is
similar to the technical alarm but reports less serious problems. In the standard
configuration it is not in use because no parts of lesser importance are attached to the
system.
The two remaining buttons are called "SMS" and "EMAIL". They indicate the status of
the optionally available remote alert modules. A red SMS button, for example,
indicates that an SMS was sent due to a currently active alarm.
All described elements together allow a very quick overview of the current system
state. Figure 3-3 on page 24, for instance, pictures a system in normal operation. All
monitored devices are grouped into a single department, which does not report an
alarm. "Tech" and "M1" also indicate a system running well within specifications. In
addition, the system is capable of sending SMS and E-Mail, but the blue color
indicates that in case of a malfunction these features will not be used because they
are turned off in the configuration of XiltriX.
3.2.1 Current Possibilities to Display and Analyze Stored Data<br />
As already pointed out in sections 2.3 and 3.2, the recorded data would allow
extended data analysis, but currently available systems only offer basic manual
analysis. The current version of XiltriX is limited to the following possibilities:
19 Substations are explained in section 3.1<br />
20 See section 3.3.3 for details<br />
1. Display stored data in table form<br />
2. Display stored data in graphical form<br />
3. Display basic statistical information<br />
Figure 3-4 pictures the first possibility, displaying stored data in numerical
form. This table contains all available information of a selected device. The first two
columns contain the date and time of storage. This information is saved in
local time (Central European Summer Time) and GMT (Greenwich Mean Time).
Normally, one of these two columns would be enough, but as XiltriX is certified
according to ISO 9001:2000, both columns are necessary.
Columns three and four contain the measured temperature. The "raw
value" is the raw digital value received from the attached sensor. This
value is converted to the Celsius scale and stored as the "evaluated value". The needed
conversion factor is about 100:1; to identify the exact factor, a calibration is done
regularly for every single sensor. Moreover, "lo" and "hi" contain the critical
temperature limits set at storage time. In combination with the "evaluated value" it is
possible to analyze in retrospect the number of alarms during a specified time period.
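The conversion just described can be illustrated as follows; the function name and the example calibration factor are assumptions, only the nominal 100:1 ratio is taken from the text:

```python
# Hypothetical sketch of the raw-to-Celsius conversion: the nominal factor
# is 100:1, but regular calibration yields the exact per-sensor factor.

def evaluated_value(raw_value: int, factor: float = 100.0) -> float:
    """Convert a raw digital sensor reading to the stored Celsius value."""
    return raw_value / factor
```

A raw value of -8000 thus corresponds to -80.0 °C with the nominal factor, while a calibrated factor such as 99.6 shifts the result slightly.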
Figure 3-4: XiltriX - Stored Data in Table Form [DEMO06]<br />
The other columns may remain empty because they offer information on additionally
attachable sensors and switches. If, for instance, a door sensor is installed, the
corresponding column will contain a Boolean value: "0" indicates a closed and "1"
an open door. As this section shall only give an overview of the stored data,
the other possible switches will be explained in section 3.3.1.
Looking at data in graphical form offers a much better overview of past behavior than
numerical data. A comparison of Figure 3-4 and Figure 3-5 demonstrates this
difference. Both contain the same dataset within the same time range, which
can be chosen freely. The biggest problem of the table form is the limited number of
values that can be displayed on one screen without using the scroll bar. The graph,
in contrast, can be scaled to fit the screen, so that the whole behavior within the chosen
time range can be seen immediately. This allows an evaluation of the behavior of a
cooling device within a very short time. It is easy to see that the fridge in this example
has a regular pattern with only very few outliers. This information would be hard to
obtain without this visual help.
Figure 3-5: XiltriX - Stored Data in Graphical Form [DEMO06]<br />
This way of visualizing data is the most common way past data is looked at
[Weerdesteyn06]. Due to missing additional decision support, this is currently the
only way "data analysis" can be done. In fact, the person in charge has to analyze
the behavior of the different monitored devices by looking at their graphs. In
case of uncommon behavior, it is necessary to look at the specific graph more
frequently.
The third way data can be displayed in XiltriX is basic statistics. It offers
additional information to determine the current condition of a monitored cooling
device. Although statistical analysis is a powerful method to detect changes in
behavior, the current approach is too simple, as described in the following. 21
Figure 3-6: XiltriX – Available Statistical Information [DEMO06]<br />
Figure 3-6 presents the currently available statistical data. Again, the calculations
refer to the same dataset as above. The columns contain the channel number
of the connected sensor as well as the minimum, the maximum and the average
temperature value. Furthermore, the standard deviation, the number of occurred
alarms and the mean kinetic temperature (MKT) are given. The calculation of these
values is based on the stored measuring points.
21 See section 5.10 for new approaches
However, these stored measuring points may span irregular time ranges due to XiltriX's
storage behavior. In the standard configuration, the system updates every measured
temperature value once a minute, while a measuring point is stored on the web server
only every 15 minutes. If one of the monitored cooling devices exceeds its temperature
limits, the saving behavior of XiltriX changes for this particular device: as long as
the alarm is not reset 22 , a measuring point is stored every single minute.
An installation of a door sensor results in additionally stored measuring points.
Besides the regularly stored points, a measurement is added every time the status of
a door sensor changes. If, for instance, the door of a fridge is opened five times
within one minute, ten measuring points will be saved for this particular device. In
contrast, it is possible that not a single data point is stored within 14 minutes if
no door is opened and the temperature stays uncritical.
Due to this irregular storage behavior of XiltriX, the computed statistical values are
not necessarily correct, because the temperature values are not weighted over time
during calculation. Only the offered mean kinetic temperature considers the different
time frames and provides fully correct results.
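A duration-weighted calculation would remove this bias. The sketch below shows how a time-weighted mean and the mean kinetic temperature (MKT) could be computed from irregularly spaced samples. It follows the standard MKT definition with the commonly assumed activation energy of 83.144 kJ/mol; the function name and data layout are hypothetical, not XiltriX's code:

```python
import math

# Hypothetical sketch: weight each stored temperature by the time until the
# next measuring point, so irregular storage intervals no longer bias the
# statistics. MKT uses the standard Arrhenius-weighted definition.

def weighted_stats(samples, dh=83144.0, r=8.3144):
    """samples: list of (minute, temp_celsius) in time order.
    Returns (time-weighted mean, MKT), both in degrees Celsius."""
    weights = [t1 - t0 for (t0, _), (t1, _) in zip(samples, samples[1:])]
    temps = [temp for _, temp in samples[:-1]]
    total = sum(weights)
    mean = sum(w * t for w, t in zip(weights, temps)) / total
    kelvin = [t + 273.15 for t in temps]
    mkt_k = (dh / r) / -math.log(
        sum(w * math.exp(-dh / (r * tk)) for w, tk in zip(weights, kelvin))
        / total)
    return mean, mkt_k - 273.15
```

A one-minute alarm spike to 8 °C between regular 15-minute points at 4 °C then contributes only 1/32 of the total weight instead of one of five equally counted measuring points.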
3.2.2 Documentation of Occurred Alarms<br />
In addition to temperature data, events are stored in the database. Divided into
several log files, logins as well as configuration changes are documented. The most
important log file contains information about occurred alarms and their reasons.
As already indicated in section 3.2, every time
an alarm goes off, it has to be acknowledged
by a person in charge. This acknowledgement
is done by an alarm reset. Figure 3-7 pictures
the alarm documentation functionality. It offers
the possibility to document the reason of an
alarm as well as the actions performed to solve the
occurred problem. Besides some available
presets, it is also possible to enter a reason as
22 See section 3.2.2 for details<br />
Figure 3-7: XiltriX - Alarm Documentation Window [DEMO06]
free text. Moreover, it is possible to define an activation delay in minutes; within that
time no new alarm will go off.
This stored information can be useful to evaluate the condition of a monitored cooling
device. Many alarms due to open doors, for instance, indicate user misbehavior. By
contrast, many alarms due to repair work or maintenance indicate unreliability of the
monitored device. Because of the problem of unknown influences mentioned in
section 2.4, this kind of documentation is very important to get at least some
information on a freezer's condition. A significantly high number of repair and
maintenance activities leads to the assumption that the monitored device has to be
replaced.
Unfortunately, this documentation possibility is rarely used in practice. Because
of the high quantity of false alarms, the employees get tired of documenting and
just leave the input fields blank when resetting an alarm.
3.3 XiltriX’s Additional Features<br />
Up to now, the introduction of XiltriX focused on basic functionality that should
be mandatory for every kind of sensor based temperature monitoring system. In the
following, the focus will be on additional features that were implemented to
solve some of the existing problems. 23
3.3.1 Different Types of Attachable Digital Switches<br />
One of these features is the opportunity to attach digital switches to XiltriX. These
switches can be coupled to a certain monitored device but do not have to be. In general,
there are three configuration possibilities for every digital switch:
1. A door switch<br />
2. A start/stop switch<br />
3. A high/low switch<br />
First of all, it can be configured as a door switch. Coupled to a certain monitored
device, it signals every single door opening and closing to XiltriX. Besides the
regularly stored measuring points, an additional value is written to the database every
time the switch changes state. If a door switch is not coupled to a certain device, it can be used,
for instance, to monitor the room door. If someone opens that door, this information
will be displayed at the bottom of the main screen. Depending on the configuration,
an alarm could also go off.
23 See section 2.4 for details
Aside from that, it is possible to configure a start/stop switch. This could be useful, for
example, if regular maintenance work has to be done on certain monitored devices. If,
for instance, a freezer is emptied and turned off, such a switch could stop the monitoring
or just suppress alarms.
The third usage possibility is a high/low switch. It enables the person in charge to
switch between two limit configurations. If, for example, a room door is opened, the
limits could be set to a higher span as long as the door is open. XiltriX offers
additional ways to adapt limits to different kinds of situations, which will be explained
in the following section 3.3.2.
3.3.2 Time-Dependent Limit Settings<br />
The just introduced high/low switch already offers the possibility to select one of two
limit configurations by just pressing a button. But the determination of critical limits is
still one of the greatest problems. Section 2.4.3 introduced the problem of setting
critical temperature limits: in particular, frequent door openings and other unknown user
behavior cause many false alarms. As already pointed out, there is currently no way
in practice to solve this problem.
That is why XiltriX offers another workaround besides the high/low switch. This
workaround is based on the assumption that a monitored device is not exposed
to the same influences around the clock. Typically, there is a high quantity of external
user influences like door openings during working time and only very few influences at
night. For a scenario like that, XiltriX offers the possibility to set time dependent
limits. To this end, one of five evaluation functions can be chosen and configured:
1. Permanent measuring value<br />
2. Day cycle<br />
3. Multiple day cycle<br />
4. Permanent measuring value with on/off recognition<br />
5. Multiple day cycle with on/off recognition<br />
The first evaluation function sets static limits and static delay times as explained in
section 2.3. The second and third function offer the possibility to define time
dependent limit and delay settings, so that limits with a higher span can be set at
daytime and limits with a lower span at nighttime. In addition, the
multiple day function enables the person in charge to define different settings for
different days of the week.
Figure 3-8 exemplifies this possibility. The pictured configuration presents a limit
setting of 2°C and 8°C during working time from Monday to Friday. During the
residual time, limits of 2°C and 6°C are set. The activation delay defines the time that
may elapse after switching before the new limits have to be met.
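A configuration like the one in Figure 3-8 can be expressed as a simple lookup. The sketch below assumes working hours of 8:00 to 17:00, which are not stated in the text; only the limit values are taken from the figure:

```python
from datetime import datetime

# Hypothetical sketch of the "multiple day cycle" evaluation function:
# weekdays during working hours use the wider 2-8 degC span, all residual
# time uses the narrower 2-6 degC span (values as in Figure 3-8).

def current_limits(now: datetime) -> tuple:
    """Return the (low, high) temperature limits valid at `now`."""
    on_weekday = now.weekday() < 5        # Monday (0) to Friday (4)
    in_working_time = 8 <= now.hour < 17  # assumed working hours
    if on_weekday and in_working_time:
        return (2.0, 8.0)
    return (2.0, 6.0)
```

The monitoring loop would simply evaluate this lookup before classifying each new temperature value.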
The last two evaluation functions combine the idea of a high/low switch and
the idea of setting time dependent limits. Especially the last function offers the
possibility to extend a configuration like the one shown in Figure 3-8.
Figure 3-8: XiltriX - Time Dependent Limit Settings [DEMO06]<br />
3.3.3 Alarm-, SMS- and E-Mail-Programs<br />
Another powerful additional feature of XiltriX is the vast quantity of notification options
in case of a critical situation. Besides a message on the main screen, XiltriX offers the
possibility to send several kinds of local and remote messages like SMS or E-Mail.
Furthermore, additional local hardware like sirens or flashlights can be controlled. To
enhance the value of these possibilities, XiltriX offers alarm-, SMS- and E-Mail-programs
that allow a comfortable configuration for every single monitored
device.
Alarm-programs enable the person in charge to define the alarming behavior of
additionally attached hardware. Up to eight different programs per department can be
configured. Similar to the time dependent limit setting from section 3.3.2, an alarm-program
schedules different types of alarm relays. These relays have to be
configured in advance. Figure 3-9 exemplifies a configuration for a locally installed
flashlight.
Figure 3-9: XiltriX - Setting up an Alarm Relay [DEMO06]<br />
Each configuration contains settings about the kind of alarm and one of the following<br />
three functions:<br />
1. On continuously<br />
2. On/off once<br />
3. On/off cyclically<br />
The first function keeps the relay active as long as the alarm is active. An installed
flashlight, for instance, would stay turned on for the whole duration of the alarm. The second
function also activates the corresponding relay as soon as a critical situation occurs,
but deactivates it again after the expiration of a defined report duration
time, which can be set between 1 and 99 seconds. Function number three is
an extension of the second one: it also deactivates the relay after the set
report duration time, but activates it again after a set delay time for as long as the alarm
has not been reset. This delay time can range from 1 to 99 minutes.
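The three relay functions can be summarized in a small sketch. It only illustrates the timing logic described above; the names and the simplifying assumption that the alarm stays active are mine:

```python
# Hypothetical sketch of the three relay functions. Given the seconds
# elapsed since a still active, not yet reset alarm went off, report
# whether the relay is energized.

def relay_energized(function, seconds, report_s=10.0, delay_min=1.0):
    if function == "continuous":      # 1. on as long as the alarm is active
        return True
    if function == "once":            # 2. on, then off after the report time
        return seconds < report_s
    if function == "cyclic":          # 3. repeat on/off until the reset
        cycle = report_s + delay_min * 60.0
        return (seconds % cycle) < report_s
    raise ValueError("unknown function: " + function)
```

With the Figure 3-9 values (ten seconds on, one minute off), the cyclic flashlight is lit during the first ten seconds of every 70-second cycle.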
Aside from these functions, the kind of alarm the relay reacts to can be influenced. As already
explained in section 3.2, XiltriX is able to classify an alarm as a technical or a master
alarm. A technical alarm indicates a malfunction in communication. A master alarm
normally goes off when there is no reaction to a prior alarm.
Figure 3-9 shows the configuration of an installed flashlight. As soon as an alarm
caused by the monitored device goes off, the flashlight turns on for ten seconds,
turns off for one minute and turns on again. In case of a technical alarm or a master alarm 1
or 2, the flashlight will also signal the current situation as soon as the defined
delay times of 5 to 15 minutes are exceeded.
Besides the just introduced alarm-programs for local notification hardware, XiltriX also
offers functionality to notify persons in charge by means of SMS or E-Mail.
For this purpose, the optionally available modules have to be attached to the system.
The notification via E-Mail and SMS is quite simple. For each channel, a list of up to three
responsible employees can be configured. As soon as an alarm goes off, the first
person is notified. After a definable delay time, the second one is notified if the
alarm is still active. As long as the alarm is not reset, XiltriX continues to notify
these three people one after another. To allow an easy setup, XiltriX offers 16
configurations per channel. These configurations can be scheduled like the time
dependent limits from section 3.3.2, so that the system always notifies the currently
responsible employees.
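The escalation chain just described can be sketched as a function of the time since the alarm started. The names and the cycling behavior beyond the third person follow my reading of "one after another" and are assumptions:

```python
# Hypothetical sketch of the SMS/E-Mail escalation: the first contact is
# notified immediately, each further contact after another `delay` minutes,
# cycling through the list until the alarm is reset.

def contact_to_notify(contacts, minutes_since_alarm, delay=10.0):
    """Return the contact notified at the given time after the alarm."""
    step = int(minutes_since_alarm // delay)
    return contacts[step % len(contacts)]
```

An alarm that is never reset thus keeps cycling through the three configured employees rather than silently stopping after the last one.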
3.4 Review of XiltriX According to the Requirements Analysis<br />
The previous sections introduced basic and additional features as well as the
technical basis of XiltriX. Especially the additional features were intended to solve the
existing methodological problems.
Time dependent limits, for example, are able to reduce the number of alarms at
daytime by choosing limits with a higher span. At night, the span can be reduced to
achieve a lower probability of an error of second kind. 24 Furthermore, the system is
not only able to alarm locally, but also by the use of E-Mail and SMS. Features like
that are meant to mitigate the lack of information problem. 25
24 See section 2.4 for details<br />
25 See section 2.4.1 for details<br />
But all these approaches are based either on the idea of adapting the limit settings or
on the assumption that an immediate notification leads to a lower risk. In fact, XiltriX
loses credibility due to the high quantity of false alarms. This leads to a higher risk
because the probability that an alarm is not taken seriously becomes very high.
Table 3-3 contains the requirements that have to be met by a sensor based
temperature monitoring system 26 and XiltriX's compliance with them. XiltriX is
currently only able to classify the current state by evaluating actual temperature
values. Furthermore, significant changes on the short run may be detected, as they
normally lead to regular temperature limit violations. Hence, the first two
requirements are partly fulfilled. All other requirements, including the avoidance of
errors of second kind, cannot be satisfied.
Table 3-3: Compliance of XiltriX According to the Requirements Analysis<br />
XiltriX is able to classify the current state of a monitored device<br />
XiltriX is able to recognize significant changes of behavior on the short-run<br />
XiltriX is able to recognize significant changes of behavior on the long-run<br />
XiltriX is able to predict upcoming failures<br />
XiltriX is able to identify failures as soon as they are recognizable<br />
XiltriX is able to avoid an error of second kind in any case<br />
XiltriX is able to recognize external influences<br />
This review demonstrates the need to find better ways of data analysis within
XiltriX to improve the current situation. The succeeding chapter 4 will point out the
current state of research, and chapter 5 will present promising analysis
methods. Before that, section 3.5 briefly introduces other major temperature
monitoring products to point out other approaches available in the market.
3.5 Other Major Monitoring Products in the Market<br />
As mentioned in section 2.3, three different types of sensor based temperature<br />
monitoring are current practice:<br />
26 See section 2.5 for details<br />
1. Temperature verification in retrospect<br />
2. Online comparison of current temperature values to a specified range<br />
3. Online comparison and data analysis in retrospect<br />
The following subsections will introduce major products representing these types
and will review their compliance with the requirements. Moreover, differences to XiltriX
will be pointed out.
3.5.1 3M FreezeWatch and 3M MonitorMark Indicators<br />
The company 3M offers two very simple temperature monitoring solutions. They<br />
are called “3M FreezeWatch Indicator” and “3M MonitorMark Time Temperature<br />
Indicator” (pictured in Figure 3-10 and Figure 3-11). Both products are designed for<br />
short term temperature verification in retrospect, especially during shipment.<br />
Figure 3-10: 3M FreezeWatch Indicator [3M2006]
Figure 3-11: 3M MonitorMark Time Temperature Indicator [3M2006]
The FreezeWatch Indicator consists of an ampoule with an indication liquid inside.
This ampoule is placed near the transported material during shipment. As soon as a
critical temperature is reached, the indicator paper on the backside changes color
irreversibly. This enables the receiver to verify whether a critical temperature was
exceeded during shipment or not. The FreezeWatch is available as a 0°C and as
a -4°C version. [3M2006]
The MonitorMark Time Temperature Indicator is based on the same idea. Besides
indicating a temperature exceeding, it offers additional basic information on
duration or maximum temperature. As soon as the critical temperature is reached,
the indicator paper starts to turn blue from left to right; the higher the
temperature, the faster the bar advances. Response cards are used to analyze the time
temperature relation for each indicator paper. These cards cannot determine the
highest temperature that actually occurred or the exact duration, but they offer a worst case
estimate of the highest possible values. This is because a long blue bar could be
caused either by a temperature just above the critical limit over a longer time period or by a high
temperature over a short duration. [3M2006]
The functionality of the just introduced products is very limited, but they are easy to deploy
during shipment. Due to missing alarming possibilities and a missing online
comparison of current temperature values, not a single requirement can be fulfilled
by using these products. Hence, this approach is not suitable for lab equipment
monitoring, as the following Table 3-4 shows.
Table 3-4: Compliance of 3M Indicators According to the Requirements Analysis<br />
Product is able to classify the current state of a monitored device<br />
Product is able to recognize significant changes of behavior on the short-run<br />
Product is able to recognize significant changes of behavior on the long-run<br />
Product is able to predict upcoming failures<br />
Product is able to identify failures as soon as they are recognizable<br />
Product is able to avoid an error of second kind in any case<br />
Product is able to recognize external influences<br />
3.5.2 2DI ThermaViewer<br />
The company 2DI offers the "ThermaViewer". This instrument is intended to monitor
single devices without the need for a PC. It is equipped with two sensors to measure
temperature and humidity level. A computer is not necessary because the
ThermaViewer is capable of storing 44000 measuring points itself. The stored data
can be displayed on its own display, as pictured in Figure 3-12. [2DI2006]
Figure 3-12: 2DI ThermaViewer [2DI2006]
Besides basic display operations like zooming and
scrolling, the device offers an interface to copy stored
data to a computer for archival purposes. Moreover,
critical minimum and maximum temperature limits can
be set. Similar to XiltriX, this device is able to indicate
alarms by the use of additionally attached equipment like sirens, flashlights or dialers.
[2DI2006]
The ThermaViewer is a quite powerful solution for temperature monitoring
within small laboratories. But as it again uses the approach of setting critical
temperature limits, it faces the same methodological problems as XiltriX. Due
to the same approach and a similar implementation, the ThermaViewer complies with
the requirements analysis in the same way as XiltriX. This is illustrated in Table 3-5.
Table 3-5: Compliance of the 2DI ThermaViewer According to the Requirements Analysis<br />
Product is able to classify the current state of a monitored device<br />
Product is able to recognize significant changes of behavior on the short-run<br />
Product is able to recognize significant changes of behavior on the long-run<br />
Product is able to predict upcoming failures<br />
Product is able to identify failures as soon as they are recognizable<br />
Product is able to avoid an error of second kind in any case<br />
Product is able to recognize external influences<br />
An additional problem of the ThermaViewer is its limited decentralized data storage,
which prevents extended data analysis on the long run, because only the last 44000
measuring points are available (that is about 15 months if a measuring point is
stored every 15 minutes). But as this is not a methodological problem, only a
question of available memory, it can be neglected.
3.5.3 Systems Offering Data Analysis in Retrospect<br />
Up to now, the products introduced within section 3.5 were kept relatively simple. In
fact, most available systems in the market are as simple as the just introduced
ones. The ThermaViewer, for instance, is already one of the few more highly
developed systems, because it is able to store and display historical data.
Most other monitoring products, like the "Temperature Alarm System" from Triple Red,
just alarm in case of a temperature exceeding without offering additional information
[Triple06].
Only very few centralized temperature monitoring systems like XiltriX exist in the
market, because most laboratories are still not aware of the danger of malfunctioning
cooling devices and do not install expensive monitoring products like that. 27 Besides
XiltriX, there are the following major systems in the market:
1. Labguard 2 (AES Chemunex)<br />
2. FlashLink Wireless System (DeltaTRAK)<br />
3. Centron Environmental Monitoring System (Rees Scientific)<br />
The basic approach of Labguard 2 and the FlashLink Wireless System is very similar<br />
to XiltriX. Sensors are attached to cooling devices to obtain information about the<br />
current state and to raise an alarm when predefined critical temperatures are<br />
exceeded. Furthermore, the collected data is stored on a centralized web server and<br />
can be accessed from every connected client computer. [AES06], [DeltaTRAK06]<br />
The sensors of Labguard 2 and the FlashLink Wireless System do not need a<br />
substation or other wiring. They communicate directly with the web server<br />
exclusively via radio signals. The FlashLink Wireless System is limited to 100<br />
sensors, which can monitor temperature or humidity. Labguard 2 is able to<br />
communicate with more kinds of sensors, so that the system can also monitor<br />
pressure, CO2, O2 and other quantities. [AES06], [DeltaTRAK06]<br />
The bundled software packages also differ little from each other. The FlashLink<br />
software offers basic functionality and has to be installed on every machine that<br />
should gain access to the data. Besides setting critical alarm limits, it is possible to<br />
plot graphs from historical data and to alert a person in charge by e-mail or recorded<br />
voice calls. Labguard 2 is very similar, but it only supports remote notifications via<br />
e-mail. On the other hand, it offers extended possibilities to display graphs, including<br />
zooming, scrolling and a data export to Microsoft Excel. [AES06], [DeltaTRAK06]<br />
Compared to each other, XiltriX offers the best software of the three presented<br />
systems, because it does not need a local installation on every client computer and<br />
offers additional features like the time dependent limit settings. But the general<br />
approach of just defining critical temperature limits remains the same. Hence, the<br />
reviews of the two monitoring solutions just presented and of XiltriX against the<br />
requirements analysis are alike. 28<br />
27 See section 2.1 for details<br />
28 See section 3.4 for details<br />
The third mentioned Centron Environmental Monitoring System is a building<br />
management system (BMS), which is designed to monitor the infrastructure of a<br />
whole building. Centron can be used to control access to labs or other important<br />
rooms. In addition to that, the system is able to save energy, for instance by turning<br />
off the lighting in empty rooms. [Rees06]<br />
More important for this diploma thesis is the ability to monitor the temperature of<br />
cooling devices. Again, sensors are attached to freezers and the signal is transmitted<br />
to a web server. The possibilities to display and analyze historical data are very<br />
similar to XiltriX. The underlying technology is comparable as well, because historical<br />
data can likewise be accessed with a simple web browser. [Rees06]<br />
Figure 3-13 exemplifies the ability to display two graphs with different temperature<br />
scales. Drawing a graph like that is impossible with XiltriX because it is only capable<br />
of using one scale per axis. Another advantage of Centron is the possibility to<br />
integrate floor plans, so that in case of an alarm not only the name of a device is<br />
displayed but also the location. As Centron is not only a temperature monitoring<br />
system but a building management system, it offers the possibility to use nearly all<br />
kinds of connected devices to send a remote notification in case of a fridge’s<br />
malfunction. [Rees06]<br />
But minor differences like these do not provide a better way of data analysis,<br />
because the general approach of just defining critical temperature limits remains the<br />
same. That is why Centron faces the same problems as all other products introduced<br />
in the market. Hence, other approaches have to be found to improve the current<br />
situation. As already pointed out in section 2.4.1, additional sensors could partly<br />
compensate for the lack of information. But installing additional hardware increases<br />
expenses. That is why this diploma thesis focuses on analysis of the already<br />
recorded data to gain additional status information about monitored devices.<br />
This chapter introduced the major monitoring products available in the market. Some<br />
systems are kept simple; others offer many additional features. But a detailed<br />
analysis of all these products revealed that they are all based on the same<br />
insufficient idea of setting critical temperature limits.<br />
Hence, other approaches have to be found that use not only the current<br />
temperature but also the stored historical data to determine a cooling device’s<br />
condition. Therefore, the next chapter will review the current state of research within<br />
the setting of sensor based temperature monitoring and other similar settings.<br />
Figure 3-13: Centron - A Sample Graph with Multiple Scales [Rees06]<br />
4 Current State of Research<br />
As already mentioned within this diploma thesis, there seems to be no research<br />
activity within this particular setting of sensor based temperature monitoring of<br />
cooling devices at the moment. That is why the approach described in section 2.3<br />
still seems to be state of the art.<br />
Hence, this chapter will focus on similar fields of activity. The settings of machinery<br />
condition monitoring and of measurement data analysis seem promising. Therefore,<br />
current research activities within these fields will be introduced and tested for<br />
applicability.<br />
4.1 Current State within the Setting of Sensor Based Temperature<br />
Monitoring<br />
The only article found that describes exactly this setting was written by H. Bonekamp<br />
in 1997 and is called “Monitor to guard fridge temperature”. The article points out the<br />
importance of temperature monitoring of fridges, even at home. In particular, food<br />
spoilage due to excessive temperatures inside the fridge could be avoided by the<br />
use of temperature monitoring. Bonekamp suggests and describes the installation of<br />
a small sensor based temperature monitoring device. It consists only of three LEDs<br />
that indicate a low, a correct or a high current temperature. [Bonekamp97]<br />
As this approach is also based on the idea of just setting critical temperature limits to<br />
classify the current condition of a cooling device, other approaches have to be found.<br />
Therefore, the following subsections will briefly introduce common approaches and<br />
the current state of research from the related settings of machinery condition<br />
monitoring and measurement data analysis.<br />
4.2 Current State within the Setting of Machinery Condition<br />
Monitoring<br />
Condition monitoring of industrial machinery is done for many different reasons.<br />
Common ones are ([Kolerus95], p. 3):<br />
• To avoid damage to machinery, employees or the environment<br />
• To avoid unexpected breakdowns of machinery<br />
• To do condition based maintenance<br />
• Quality control<br />
To achieve these and other goals, several approaches at different conceptual levels<br />
exist. In general, four different levels are distinguished ([Kolerus95], p. 4):<br />
1. Surveillance<br />
2. Early recognition of failures<br />
3. Failure diagnosis<br />
4. Trend analysis<br />
Surveillance is the most basic goal. Its only task is to recognize a malfunction that<br />
has just occurred and to react in a predefined way (e.g. raise an alarm, shut down<br />
the machine, etc.). The early recognition of failures should detect not only<br />
malfunctions that have already occurred but also slowly emerging misbehavior, to<br />
allow a reaction before a total breakdown. The last two levels shall predict upcoming<br />
malfunctions before they actually occur. The failure diagnosis is based on the<br />
analysis of sensor data. The trend analysis extends this diagnosis by predicting the<br />
actual time the malfunction will happen.<br />
Looking at the goals and the different conceptual levels of machinery condition<br />
monitoring indicates the similarity to sensor based temperature monitoring.<br />
Especially the last two conceptual levels are comparable to the requirements from<br />
section 2.5. Hence, approaches of machinery condition monitoring have to be tested<br />
for applicability.<br />
Probably the most common way of machinery condition monitoring in practice is<br />
vibration analysis. Its basic idea is to obtain information about the current condition<br />
of a machine by measuring the vibration level of important moving parts. The<br />
vibration changes over time due to friction. The measured vibration is compared to a<br />
rated value to classify the current condition of the monitored parts. Depending on the<br />
kind of machine, different VDI guidelines exist that describe critical vibration values<br />
(e.g. VDI 2056, VDI 2059, etc.). ([Kolerus95], p. 8-19)<br />
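The comparison of a measured vibration level to a rated value can be sketched in a few lines. The following Python fragment is only an illustration: the function name, the sample signal and the rated RMS limit are assumptions made here, since the actual limits come from guidelines such as VDI 2056.<br />

```python
import numpy as np

def vibration_state(signal, rated_rms):
    """Classify a machine part by comparing the RMS of the measured
    vibration signal to a rated value. The rated value is a placeholder;
    in practice it would be taken from the applicable VDI guideline."""
    rms = float(np.sqrt(np.mean(np.square(np.asarray(signal, dtype=float)))))
    return ("alert", rms) if rms > rated_rms else ("ok", rms)

state, rms = vibration_state([0.2, -0.3, 0.25, -0.2], rated_rms=1.0)
```

Even this sketch shows the limitation discussed next: a single threshold comparison cannot tell a worn bearing from an external disturbance.<br />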
This kind of condition monitoring is quite easy to implement by just attaching<br />
vibration sensors to important parts. But due to the use of a single sensor, this<br />
approach is also faced with a lack of information and is not able to recognize<br />
external influences. 29 To reduce the probability of externally influenced results, a<br />
filter can be applied to the measuring data to cut off untypical frequency ranges (e.g.<br />
[Kolerus95], p. 22-29). But even this improvement still faces many problems that are<br />
very similar to the ones described in section 2.4 (see e.g. [Pitter01], p. 63-68).<br />
29 See section 2.4 for details<br />
Hence, one goal of current research is to measure additional quantities to improve<br />
this type of machinery condition monitoring. The research and development of<br />
sensors is mainly based on two ideas:<br />
• To create multi sensors<br />
• To create “intelligent” sensors<br />
Multi sensors allow the monitoring of different quantities at the same time and place<br />
([Krallmann05], p. 50). The main advantages are cost savings, higher reliability and<br />
the fusion of different kinds of measurements at one place ([Pitter01], p. 76). 30<br />
Intelligent sensors are based on mechatronics. The main idea of this research<br />
activity is to combine the fields of mechanics, electronics and information processing<br />
([Pitter01], p. 27). This combination allows an interaction between mechanical and<br />
electronic parts in the form of a control cycle. The approach offers new possibilities<br />
of monitoring and data analysis. 31 But as this diploma thesis shall be based on<br />
currently existing data, there will be no focus on this activity.<br />
Beside the improvement of sensors, current research also focuses on knowledge<br />
driven approaches. The main idea is to combine measured data with additional<br />
knowledge of the underlying process (e.g. [Tröltzsch06], p. 10). As this additional<br />
knowledge is specific to certain settings, only the general ideas of research activities<br />
can be transferred from one setting to another.<br />
In general, two knowledge driven approaches can be distinguished:<br />
• Knowledge models<br />
• Artificial neural networks<br />
A knowledge model is the most specific approach. It contains information about the<br />
underlying process, so that current measurement values can be classified in a better<br />
way. Often, this kind of model is combined with vibration analysis, for instance to<br />
determine friction. In this case, a knowledge model could contain additional<br />
information about typical frictional behavior (e.g. linear vs. non-linear friction).<br />
([Sick00], p. 5-7)<br />
30 See e.g. [Krallmann05], [Pitter01] for details<br />
31 See e.g. [Pitter01]<br />
Besides these specific knowledge models, there are artificial neural networks. These<br />
networks adopt the general functioning of a human brain. This means that an artificial<br />
neural network has to be trained with sample or historical data in advance, so that it<br />
is able to acquire knowledge. After this training, such a network is able to judge<br />
situations as regular or irregular like a human brain. ([Hagen97], p. 5-6)<br />
As this approach learns on its own from training data or past behavior, it is much<br />
more flexible than a predefined knowledge model. 32 Both knowledge models and<br />
trained artificial neural networks can be used as a knowledge base for expert<br />
systems, which have the task of deciding in an automated way (e.g. [Krems94],<br />
[Heuer97]).<br />
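As a minimal illustration of this training idea, the following Python sketch trains a single logistic unit, the simplest form of an artificial neuron, to judge temperatures as regular or irregular. The sample data, the labels and all parameters are invented for this sketch; a detailed description of artificial neural networks follows in section 5.9.2.<br />

```python
import numpy as np

# Invented sample data: freezer temperatures near -20 degC are regular
# (label 0), values far above are irregular (label 1).
temps = np.array([-22.0, -21.0, -20.0, -19.5, -19.0, -6.0, -5.0, -4.0, -3.0])
labels = np.array([0, 0, 0, 0, 0, 1, 1, 1, 1], dtype=float)

mu, sigma = temps.mean(), temps.std()   # standardize the input
x = (temps - mu) / sigma

w, b = 0.0, 0.0                         # the neuron's weight and bias
for _ in range(2000):                   # gradient descent on the log-loss
    p = 1.0 / (1.0 + np.exp(-(w * x + b)))  # sigmoid activation
    w -= 0.5 * np.mean((p - labels) * x)
    b -= 0.5 * np.mean(p - labels)

def judge(temperature):
    """Return 1 (irregular) or 0 (regular) for a new measurement."""
    z = w * (temperature - mu) / sigma + b
    return int(1.0 / (1.0 + np.exp(-z)) > 0.5)
```

After training, the unit reproduces the labelled behavior on unseen values near the two clusters; real networks combine many such units and can learn far less obvious patterns.<br />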
4.3 Current State within the Setting of Measurement Data Analysis<br />
Section 4.2 introduced the current state of research within the setting of machinery<br />
condition monitoring, which faces requirements similar to those of sensor based<br />
temperature monitoring. This section will now focus on settings in which the analysis<br />
of time dependent data is used to detect changes and to predict upcoming behavior.<br />
The main focus lies on a generalized approach by Frank Daßler, which promises an<br />
early prediction of upcoming malfunctions without additional knowledge of the<br />
underlying setting ([Daßler95], p. 8).<br />
4.3.1 Basic Approaches<br />
Basic approaches rely on statistical methods. Descriptive statistical measures are<br />
used to get an aggregated overview of a dataset’s characteristics (e.g. the mean). In<br />
addition, these measures ease the comparison of different datasets (or of different<br />
parts of one dataset) ([Eckey02], p. 41).<br />
Within many settings, time series analysis is applied to measurement data. Its main<br />
task is to discover structures and irregularities within a time sequence. By detecting<br />
structures, time series analysis is not only able to describe regular behavior but also<br />
to predict the near future. 33 ([Chatfield04], p. 73-105)<br />
32 A detailed description of the functioning of an artificial neural network will be given in section 5.9.2.<br />
33 See section 5.5 for details<br />
Another current approach is regression. The main idea is to find a function that<br />
describes the temperature sequence best. The function found can be used for<br />
description as well as for prediction of future behavior ([Gentle02], p. 301). 34 Beside<br />
these approaches, artificial neural networks are applied again to gain additional<br />
knowledge (e.g. [Hawibowo97], p. 21-45). 35<br />
Chapter 5 will introduce the methods identified within this chapter and test them for<br />
applicability to the setting of sensor based temperature monitoring. Before that, the<br />
next section introduces an approach that promises to be generalized and should<br />
hence be applicable to the current problem.<br />
4.3.2 A Generalized Approach<br />
As already mentioned in section 4.3, Frank Daßler presents an approach that should<br />
be able to predict future measurement values without any knowledge of the<br />
underlying setting. The main idea is to combine several known approaches into a<br />
new one. ([Daßler95], p. 7-8)<br />
The presented approach is intended to solve the problems of just setting critical<br />
limits. According to the author, the existing methods lead to three problems<br />
([Daßler95], p. 19):<br />
1. Just setting critical limits leads to sudden changes of current state. As long as<br />
a value does not exceed the predefined range, the state is classified as OK.<br />
2. In the moment of exceeding, an immediate reaction is necessary to solve<br />
dangerous situations.<br />
3. Limits chosen with a narrower span to reduce situations of immediate danger<br />
lead to a higher number of false alarms due to outliers.<br />
Figure 4-1 illustrates the general procedure of the new approach, which shall solve<br />
the problems just mentioned:<br />
34 See section 5.4 for details<br />
35 See section 5.9.2 for details<br />
Figure 4-1: General Overview of the Generalized Approach ([Daßler95], p. 22) (adapted)<br />
The approach starts, like every analysis method, with the collection and storage of<br />
measurement data over time. As this approach is meant to be general, no<br />
requirements or restrictions are defined for this activity. ([Daßler95], p. 23)<br />
The biggest problem of analyzing measurement data is the influence of outliers on<br />
calculated results. According to the author, these outliers are caused by technical<br />
disturbances. The following list contains some major causes ([Daßler95], p. 52):<br />
• Short-term measurement connection failures or short-circuits<br />
• Short-term sensor failures<br />
• Unstable working voltage<br />
• …<br />
As outliers are able to falsify calculated results, their elimination is the first proposed<br />
step of measurement data analysis. A big problem is the recognition of outliers,<br />
because nearly all measurement data is affected by noise, which causes small<br />
perceptible deviations. To identify outliers, the distances between succeeding<br />
measurement values are determined and compared to each other. These distances<br />
vary only marginally in the case of constant noise. To classify a measurement value<br />
as an outlier, a threshold value has to be found. The suggestion is to use twice the<br />
average distance between succeeding measurement values, as described by<br />
Formula 4-1. ([Daßler95], p. 54)<br />
S_o = (2 / (n − 1)) · ∑_{i=2}^{n} |Y_i − Y_{i−1}|<br />
with:<br />
S_o = outlier threshold value<br />
Y_i = measurement values<br />
n = number of measurement values<br />
Formula 4-1: Threshold Value to Determine Potential Outliers<br />
But ignoring every measurement value with a distance higher than S_o would mean<br />
neglecting trends and other changes in behavior. That is why the number of outliers<br />
in a row is counted. Every potential outlier is set to the current mean value as long<br />
as fewer than three values in a row are classified as outliers. In the case of three or<br />
more values in a row, no further elimination takes place. ([Daßler95], p. 54-55)<br />
This approach is able to cut off single outliers. The only disadvantage is a delay of<br />
trend recognition, which is pictured in Figure 4-2. The green points represent the<br />
measured values with an existing change in trend. The red points illustrate the delay<br />
of trend recognition, because the first two higher values are classified as outliers and<br />
set to the mean value. ([Daßler95], p. 55)<br />
Figure 4-2: A Delayed Trend Recognition Due to Removal of "Outliers"<br />
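The outlier-elimination rule described above can be sketched as follows. The function name and the choice to compare each raw value against the previously accepted value are implementation decisions of this sketch, not prescribed by [Daßler95].<br />

```python
def eliminate_outliers(values):
    """Replace suspected outliers with the running mean, but stop
    replacing once three or more suspicious values occur in a row
    (interpreted as a new trend). The threshold is twice the average
    distance between succeeding values, as in Formula 4-1."""
    n = len(values)
    threshold = 2.0 / (n - 1) * sum(abs(values[i] - values[i - 1])
                                    for i in range(1, n))
    cleaned = [float(values[0])]
    run = 0  # consecutive values classified as outliers
    for i in range(1, n):
        if abs(values[i] - cleaned[-1]) > threshold and run < 2:
            run += 1
            cleaned.append(sum(cleaned) / len(cleaned))  # set to current mean
        else:
            run = 0
            cleaned.append(float(values[i]))
    return threshold, cleaned
```

Running this on a constant series followed by a level shift reproduces the delay of Figure 4-2: the first two shifted values are replaced by the mean, the third is accepted as the new trend.<br />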
After eliminating outliers, the measurement data is stored in a ring memory. This<br />
kind of memory has a fixed size. As soon as there is not sufficient memory available<br />
to add an additional measuring point, the oldest value is overwritten. This<br />
organization is used to avoid a high influence of values that are too old. The size of<br />
this ring memory is not accurately predetermined; a size of 100 to 150 values is<br />
suggested. ([Daßler95], p. 25)<br />
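Such a ring memory is directly available in Python as a deque with a fixed maximum length; the capacity of 120 used here is one arbitrary choice within the suggested range of 100 to 150 values.<br />

```python
from collections import deque

# Ring memory: once full, appending a new measuring point silently
# overwrites the oldest one.
ring = deque(maxlen=120)
for point in range(130):    # store 130 simulated measuring points
    ring.append(point)
```

After the loop the deque still holds exactly 120 values; the ten oldest points have been dropped, so the oldest surviving value is 10.<br />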
Figure 4-1 shows that the ring memory module communicates with three succeeding<br />
ones. The first step of analysis is the curve selection. The basic idea is to describe<br />
the measurement values stored within the ring memory by finding a mathematical<br />
function (called regression). The curve selection module determines this function by<br />
using the method of least squares. 36 The acquired function is used by the<br />
succeeding module to predict upcoming values. ([Daßler95], p. 25-26)<br />
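The curve selection and prediction steps can be sketched with a least-squares fit. The sketch below restricts itself to a straight line as the simplest curve type; the function name and the linear simplification are assumptions of this illustration.<br />

```python
import numpy as np

def fit_and_predict(ring, horizon=1):
    """Fit a least-squares line through the ring-memory contents and
    extrapolate `horizon` steps ahead. The same least-squares machinery
    generalizes to other curve types."""
    values = np.asarray(list(ring), dtype=float)
    x = np.arange(len(values))
    slope, intercept = np.polyfit(x, values, 1)
    return slope * (len(values) - 1 + horizon) + intercept
```

For a perfectly linear series such as 0, 1, 2, 3, 4 the next predicted value is 5; for constant data the prediction equals that constant.<br />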
These predicted values are used to recognize changes in trend. As soon as a<br />
change is recognized, the ring memory is cleared. This is especially important<br />
because old values that were stored before the change would falsify the results.<br />
Changes are identified by comparing actually measured values to their<br />
corresponding predictions. If a certain threshold is exceeded (see below), a new<br />
trend is assumed. ([Daßler95], p. 58-60)<br />
The biggest problem of a reliable prediction is the already mentioned noise, because<br />
high noise could lead to the assumption of a new trend although the behavior stays<br />
the same. That is why the noise has to be determined. The first step is the<br />
calculation of an envelope. This envelope normally includes all peaks. To exclude<br />
potentially high peaks from the envelope, the following algorithm is used ([Daßler95],<br />
p. 61):<br />
1. Select the first five measurement values<br />
2. Determine a line of best fit f_b for these values<br />
3. Calculate the distances between f_b and the measurement values<br />
4. Calculate the mean distance above the line (d_a)<br />
5. Calculate the mean distance below the line (d_b)<br />
6. Assign d_a and d_b to the measurement point right in the middle;<br />
assign f_b(X_max) + d_a to the maximum value (X_max) and<br />
f_b(X_min) − d_b to the minimum value (X_min)<br />
7. If the end is not reached, deselect the first selected value, add the next one and<br />
go to 2.<br />
36 Section 5.4 contains a detailed description of regression and the method of least squares<br />
This algorithm returns an upper and a lower boundary of an envelope. These<br />
boundaries can be used to determine the noise by using Formula 4-2. Similar to the<br />
detection of outliers, this noise can be used to identify changes in trend. If the<br />
distance between the measured and the predicted value is higher than three times<br />
the calculated noise, a significant change in trend must have taken place.<br />
([Daßler95], p. 64)<br />
N = (1 / n) · ∑_{i=1}^{n} (E_{a,i} − E_{b,i})<br />
with:<br />
N = noise<br />
n = quantity of values<br />
E_a = upper boundary of the envelope<br />
E_b = lower boundary of the envelope<br />
Formula 4-2: Calculation of Noise<br />
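The sliding-window envelope, the noise of Formula 4-2 and the three-times-noise trend test can be combined into a small sketch. The extra boundary points that the algorithm assigns at X_max and X_min (step 6) are omitted here for brevity; all function names are chosen for this illustration.<br />

```python
import numpy as np

def envelope(values, window=5):
    """Slide a five-point window over the data; in each window a line of
    best fit is computed, and the mean deviations above (d_a) and below
    (d_b) the line give the envelope at the window's centre point."""
    values = np.asarray(values, dtype=float)
    x = np.arange(window)
    upper, lower = [], []
    for start in range(len(values) - window + 1):
        seg = values[start:start + window]
        slope, icpt = np.polyfit(x, seg, 1)
        dev = seg - (slope * x + icpt)           # distances from f_b
        d_a = dev[dev > 0].mean() if np.any(dev > 0) else 0.0
        d_b = -dev[dev < 0].mean() if np.any(dev < 0) else 0.0
        centre = slope * x[window // 2] + icpt   # fitted value in the middle
        upper.append(centre + d_a)
        lower.append(centre - d_b)
    return np.asarray(upper), np.asarray(lower)

def noise(upper, lower):
    """Formula 4-2: mean width of the envelope."""
    return float(np.mean(upper - lower))

def trend_changed(measured, predicted, n):
    """A new trend is assumed if measurement and prediction differ by
    more than three times the noise."""
    return abs(measured - predicted) > 3.0 * n
```

On data that merely oscillates around a constant level, the noise stays small, so the trend test fires only for deviations clearly above the oscillation band.<br />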
Beside the recognition of changes in trend, the determination of the prediction’s<br />
probability is another important part of this approach. Four factors are identified that<br />
influence the probability of a correct prediction ([Daßler95], p. 74):<br />
• The quantity of measurement values<br />
• The noise<br />
• The curve stability<br />
• The prediction stability<br />
A small quantity of measurement values leads to a lack of information, as introduced<br />
in section 2.4.1. High noise also complicates an accurate prediction, as just shown.<br />
The last two factors are new but easy to understand. The curve stability specifies for<br />
how long the currently used describing function has not changed. The prediction<br />
stability is the percentage of correct predictions since the last change in trend. Curve<br />
stability and prediction stability can be determined by using Formula 4-3 and<br />
Formula 4-4 ([Daßler95], p. 75-76):<br />
S_c = (c / n) × 100%<br />
with:<br />
S_c = curve stability<br />
c = quantity of measurement values that did not change the predicted curve<br />
n = quantity of measurement values after the last change in trend<br />
Formula 4-3: Calculation of Curve Stability<br />
S_p = (C_p / n) × 100%<br />
with:<br />
S_p = stability of prediction<br />
C_p = counter for correct predictions<br />
n = quantity of measurement values after the last change in trend<br />
Formula 4-4: Calculation of Prediction Stability<br />
The four criteria just introduced are used to calculate the prediction’s probability. But<br />
as the single values influence each other, a simple multiplication is not sufficient to<br />
calculate the total probability of correctly predicted values. Other methods based on<br />
fixed limits of acceptable and unacceptable values again face the already mentioned<br />
sudden change of state. That is why fuzzy logic is used for the calculation.<br />
([Daßler95], p. 66-67)<br />
The main idea of fuzzy logic is to allow not only the answers 1 and 0 (as yes and no)<br />
but also values in between. This allows a better implementation of linguistic terms<br />
like “rather yes”. Hence, the total probability can be determined more precisely by<br />
using the four factors mentioned above. The presented approach uses the center of<br />
gravity method to calculate the results. But as this method is not relevant within this<br />
diploma thesis, it will not be presented here. 37<br />
The last step of the presented approach is the verification of predefined conditions,<br />
as pictured in Figure 4-1. This module evaluates the following three criteria, which<br />
have to be predefined by a person in charge ([Daßler95], p. 30):<br />
37 For details on fuzzy logic, see e.g. [Turunen99], chapter 3-4, and [Kosko99], chapter 1; for details<br />
on the actual implementation of fuzzy logic, consult ([Daßler95], p. 79-89).<br />
• Critical values<br />
• Pre-warning time<br />
• Prediction probability<br />
Hence, the presented approach allows predefinitions like, for instance: “If the<br />
temperature will reach 10 °C within the next 5 minutes with a probability of 90<br />
percent, then send a trigger signal.”<br />
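A verification rule of this kind reduces to a conjunction of three comparisons. The function and parameter names below are chosen for this sketch; the concrete numbers mirror the example rule in the text.<br />

```python
def verify(predicted_temp, minutes_ahead, probability,
           critical_temp=10.0, pre_warning_time=5, min_probability=0.9):
    """Evaluate the three predefined criteria: critical value,
    pre-warning time and prediction probability. Returns True if a
    trigger signal should be sent."""
    return (predicted_temp >= critical_temp
            and minutes_ahead <= pre_warning_time
            and probability >= min_probability)
```

All three criteria must hold at once: a prediction that is too cool, too far away, or too uncertain does not trigger.<br />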
4.4 Review of Current State of Research<br />
The current chapter introduced different approaches from similar monitoring settings.<br />
Section 4.1 noted the missing research activity within the setting of sensor based<br />
temperature monitoring. Section 4.2 pointed out the similarity to machinery condition<br />
monitoring. Basic methods like the comparison of current behavior to rated values<br />
can be found in both settings. In addition, machinery condition monitoring also uses<br />
knowledge driven approaches like artificial neural networks, which are not available<br />
within the setting of sensor based temperature monitoring. Hence, this general idea<br />
has to be tested for applicability. 38<br />
Section 4.3 introduced approaches from the setting of measurement data analysis.<br />
Basic approaches like descriptive statistical measures, time series analysis and<br />
regression also have to be tested for applicability. 39<br />
The generalized approach introduced in section 4.3.2 promises to be applicable to<br />
all kinds of measurement data without any knowledge of the underlying setting.<br />
Therefore, this approach is now reviewed against the requirements analysis from<br />
section 2.5.<br />
The first step was the elimination of outliers. It was based on the assumption that<br />
outliers are caused by technical disturbances and have to be ignored. Applied to<br />
sensor based temperature monitoring, this could lead to ignoring high temperature<br />
peaks that are actually caused by door openings. Moreover, even if a change in<br />
trend is recognized, a delay of at least two time intervals is caused. Hence, the<br />
approach is not able to identify upcoming failures as soon as they are recognizable.<br />
38 See chapter 5.9.3 for details<br />
39 See sections 5.3, 5.4 and 5.5 for details<br />
Another problem is caused by the ring memory. It was implemented to ignore old<br />
measurement values in order to avoid a high influence of these values. This allows<br />
the recognition of significant changes of general behavior on the short-run but<br />
prevents analysis on the long-run.<br />
The succeeding steps (curve selection, calculation of the prediction, recognition of<br />
changing trends and determination of the prediction’s probability) face two big<br />
problems, which are caused by the missing ability to recognize external influences.<br />
First of all, every door opening would lead to the assumption of a change in general<br />
behavior, although it should be ignored if a general change is to be identified.<br />
Moreover, this approach is also not capable of predicting upcoming failures,<br />
because door openings cannot be distinguished from real malfunctions.<br />
Faced with these problems, the verification module is not able to offer more reliable<br />
information about a device’s current state than the method introduced in section 2.3.<br />
Table 4-1 summarizes this review.<br />
Table 4-1: Compliance of the Generalized Approach According to the Requirements Analysis<br />
Approach is able to classify the current state of a monitored device<br />
Approach is able to recognize significant changes of behavior on the short-run<br />
Approach is able to recognize significant changes of behavior on the long-run<br />
Approach is able to predict upcoming failures<br />
Approach is able to identify failures as soon as they are recognizable<br />
Approach is able to avoid an error of second kind in any case<br />
Approach is able to recognize external influences<br />
The introduced generalized approach seems to be applicable within many<br />
monitoring settings that suit the author’s assumptions. But especially the ignoring of<br />
outliers and the very low probability of a real technical malfunction 40 make this<br />
approach inapplicable to the setting of sensor based temperature monitoring in<br />
practice. By contrast, single ideas, like the usage of regression and the other<br />
approaches mentioned above, will be described and tested for applicability in the<br />
following chapter.<br />
40 See section 2.2.5 for details<br />
5 Possible and Promising Ways of Data Analysis<br />
The first four chapters introduced the existing methodological problems of sensor<br />
based temperature monitoring systems and the current state of research. The<br />
second chapter pointed out the problems of the temperature monitoring task and the<br />
limitations of the current approach of just setting critical temperature limits. The<br />
biggest problem was the existing lack of information. 41 This prevents an analyst<br />
from identifying the real causes of temperature deviations. Moreover, a very low<br />
probability of real malfunctions leads to many false alarms. 42<br />
The introduction of currently available temperature monitoring products in the third<br />
chapter pointed out that no solution seems to be available that is based on another<br />
approach. Only some workarounds, like time dependent limit settings, are offered to<br />
solve the existing problems partially. 43 In fact, no introduced product fully complied<br />
with the requirements from section 2.5.<br />
In addition to that, the fourth chapter pointed out that there seems to be no research<br />
activity within this particular setting of sensor based temperature monitoring of<br />
cooling devices within medical laboratories. That is the reason why this chapter tries<br />
to find ways to gain additional information about monitored devices by the use of<br />
statistical analysis and data mining. Due to the lack of specialized methods, the<br />
analysis begins with basic statistical and data mining methods. Aside from that,<br />
specialized methods from the fourth chapter are introduced and tested for<br />
applicability.<br />
5.1 The Six Possible Levels of Data Analysis<br />
To be able to categorize different approaches to data analysis, it is important to
review its possible kinds. Data analysis can be divided into six levels of detail.
Depending on the demands of the underlying setting, it ranges from highly abstract
to very detailed. According to the chosen level, one of the following kinds of results
is aimed at: ([Berthold99], p. 171)
41 See section 2.4.1 for details<br />
42 See section 2.2.5 for details<br />
43 See section 3.3.2 for details<br />
1. Descriptive models<br />
2. Numerical models<br />
3. Graphical models<br />
4. Statistical models<br />
5. Functional models<br />
6. Analytic models<br />
Descriptive models represent the most abstract level of data analysis. They describe
circumstances purely by verbal phrasing. ([Berthold99], p. 171) The sentence
“The warmer the room ambient temperature, the higher the electric power
consumption of a freezer”, for instance, already constitutes a small descriptive model.
Although this kind of model gives no precise information about magnitudes, it offers
enough knowledge in many situations.
By contrast, a descriptive model is not capable of solving the existing problems within
the setting of sensor based temperature monitoring, because it is too abstract. A
statement like “a cooling device is malfunctioning in case of reaching an uncommon
temperature without being influenced externally” describes the problem very
accurately, but it does not state how these influences can be recognized.
Numerical models offer a more detailed description in the form of a table.
([Berthold99], p. 171) Relating to the exemplary descriptive room ambient
temperature model, an associated numerical model lists concrete room ambient
temperatures against the corresponding electric power consumption. The third level
of data analysis is simply the graphical representation of a numerical model. This is
especially useful to get an overview of large datasets in a very short time. 44
The current methodological approach of predefining critical limits to classify the
current state of a monitored system is a numerical model, because every
temperature value is assigned to either “cooling device is OK” or “cooling device is
malfunctioning”. The graphical abilities of the introduced products suffice to comply
with the third level of data analysis as well.
The last three levels of data analysis are not based on verbal phrases or sample
values but on mathematical descriptions covering all values, to achieve a higher
level of detail. Statistical models use measures like the mean temperature or the
standard deviation to illustrate relationships ([Berthold99], p. 172). A sample model
44 See section 3.2.1 for details<br />
could be: “The mean increase of a freezer’s electric power consumption is about 5%<br />
per degree room ambient temperature”.<br />
Functional models use functions to describe the existing behavior. ([Berthold99], p.
172) Finding a functional description can be very difficult and is not always possible.
A functional model of the freezer’s electric power consumption would allow a
calculation for every given room temperature. Furthermore, it could help predict
malfunctions of a cooling device by simply comparing current behavior to the
describing function.
Most powerful and detailed are analytic models. They describe relationships by the
use of algebraic or differential equations. This allows a very detailed description of
the outputs for all kinds of imaginable inputs. ([Berthold99], p. 172) As already
pointed out in section 2.4.1, many inputs and outputs and their relationships are
unknown because only very few sensors are available. That is why it seems
impossible to find analytic models with the currently available datasets.
Hence, this diploma thesis will first of all focus on possibilities to create statistical and
functional models to gain more detailed information about monitored cooling devices.
Only if this kind of model were to yield complete information would an attempt to
determine an analytic model be useful.
5.2 Different Kinds of Statistical Analysis<br />
The general purpose of statistical analysis is to provide information to support
important decisions. The main idea is to improve the quality of the decision making
process by reducing uncertainties as far as possible. In general, statistics is
divided into two branches: ([Holland01], p. 3)
1. Descriptive statistics<br />
2. Inferential statistics<br />
Descriptive statistical methods describe large available datasets. Their main purpose
is to summarize and to evaluate them. Another important task is filtering out the most
important facts to get an overview of the underlying dataset. Typical results of
descriptive statistical methods are statistical measures like the mean or the standard
deviation. The results are presented in the form of a table or a graph to offer a quick
overview. ([Holland01], p. 3)
Inferential statistical methods do not describe available datasets but try to gain
additional information from the existing data. These methods are applied to
problems where datasets cannot be obtained entirely ([Scharnbacher04], p. 43).
After obtaining parts of the totality, inferential statistical methods are used to
generalize the obtained results ([Bourier03], p. 3).
Due to the fact that modern computer systems and mainframes are able to process
large amounts of data in a very short time, statistical analysis offers additional
calculation possibilities (e.g. data mining). Furthermore, the ability to collect data in
an automated way enlarges the data basis. As a result, the probability that the
generalized results are correct increases ([Eckey02], p. 3).
5.3 Basic Descriptive Statistical Measures<br />
Section 1.2 defined the two main goals of this diploma thesis. The first one was to
gain additional knowledge about the cooling device’s condition from recorded
datasets to offer additional decision support in case of an exceptional temperature
level. Therefore, a summarization and evaluation of the available datasets using
descriptive statistics appears to be a promising approach. Hence, the succeeding
subsections will introduce both basic and special descriptive statistical measures
from other already introduced monitoring settings. 45 Moreover, the expected gain of
information is evaluated.
Descriptive statistics offer some very common measures that can be applied easily to<br />
all kinds of numerical data. Most known are: ([Holland01], chapter 4)<br />
• Minimum and Maximum<br />
• The Mode<br />
• The Median<br />
• The Mean<br />
• The Standard Deviation<br />
The first basic descriptive statistical measures are the minimum and the
maximum value. Their determination requires very little time and effort.
Nevertheless, these values can already indicate uncommon behavior. 46
45 See chapter 4 for details<br />
46 See section 6.2.1 for details<br />
The mode is the most frequent value of a dataset. It can be regarded as a kind of
center of a sorted dataset ([Eckey02], p. 42). Therefore, the mode is suitable for
getting a quick overview of the main behavior of large datasets. A disadvantage of
the mode is that it ignores outliers.
The median is a value that divides a dataset into two parts of equal size. To
calculate the median, the values of the given dataset have to be sorted by size, so
that $X_{(1)} \leq X_{(2)} \leq \ldots \leq X_{(n)}$ holds. Afterwards, the value
right in the middle is the median. In case of an even number of values, the mean of
the two middle values is taken, as described by the following Formula 5-1:
([Eckey02], p. 44)

$\tilde{X} = \begin{cases} X_{((n+1)/2)} & \text{if } n \text{ is odd} \\ \frac{1}{2}\left(X_{(n/2)} + X_{(n/2+1)}\right) & \text{if } n \text{ is even} \end{cases}$

Formula 5-1: The Median Formula
In case of a normal distribution, mode and median are very similar. This behavior<br />
changes, if the most frequent temperature value tends to one of the interval borders.<br />
Moreover, the median also ignores outliers.<br />
Probably the most common value in statistics is the mean. In fact, several different
types of mean values exist. Talking about the mean generally denotes the
arithmetic mean. It is calculated by summing all values of a dataset and
subsequently dividing by the dataset’s size. Formula 5-2 describes this procedure
in mathematical form. In contrast to mode and median, the arithmetic mean does not
ignore outliers but weights every single value equally. ([Bourier03], p. 79)
$\bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i$

Formula 5-2: The Arithmetic Mean Formula
As already mentioned, there are several different mean values. Beside the already
introduced arithmetic mean, there are the geometric and the harmonic mean. The
geometric mean is especially used to analyze growth rates ([Bourier03], p. 84). The
harmonic mean is defined to provide mean values of ratios ([Eckey02], p. 54). As
monitoring data within the setting of sensor based temperature monitoring involves
neither growth rates nor ratios, these methods will not be presented.
Another group of mean values are the weighted and moving ones. A weighted
arithmetic mean, for instance, can be used to calculate a correct mean temperature
if the underlying dataset contains different time ranges. It is also possible to assign a
higher importance to newer values (e.g. current outliers). The moving arithmetic
mean always calculates a mean value over the same number of values; typically,
the newest values are taken. As long as monitoring data is saved at constant time
intervals, this method allows the calculation of mean values for a defined time span,
e.g. the last three hours. Furthermore, it is also possible to add weighting to this kind
of mean.
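As a sketch, the weighted and moving arithmetic means described above can be implemented as follows; the temperature readings are hypothetical and assumed to be evenly spaced in time.

```python
# Weighted and moving arithmetic means, as described above.
# The temperature readings are hypothetical sample data.

def weighted_mean(values, weights):
    """Arithmetic mean with per-value weights (e.g. to emphasize newer values)."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

def moving_mean(values, window):
    """Moving arithmetic mean over the newest `window` values at each step."""
    return [sum(values[i - window + 1:i + 1]) / window
            for i in range(window - 1, len(values))]

temps = [-20.1, -20.3, -19.8, -18.9, -17.5, -16.2]   # hypothetical readings

# Emphasize recent readings with linearly growing weights 1..n.
recent_weighted = weighted_mean(temps, list(range(1, len(temps) + 1)))

# Mean over the last three readings at every point in time.
three_sample_means = moving_mean(temps, window=3)
```

With readings stored at constant intervals, the window size directly corresponds to a time span, e.g. a window of 36 for a three-hour mean at five-minute intervals.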
Up to now, the presented statistical values only analyzed average behavior. They do
not provide further information about outliers. Therefore, the standard deviation is
needed. It describes the mean variation of the data values and is calculated using
Formula 5-3. ([Eckey02], p. 71)
$\sigma = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left(X_i - \bar{X}\right)^2}$

Formula 5-3: The Standard Deviation Formula
In combination with the arithmetic mean, this measure offers quite a lot of
information. A low standard deviation indicates only slight variation around the mean
value; a high standard deviation indicates larger variation.
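The basic measures of this section can all be computed with Python's standard library; as a sketch, the readings below are hypothetical and include one deliberate outlier to show which measures react to it.

```python
# The five basic descriptive measures from this section, computed with the
# standard library. The temperature readings are hypothetical sample data.
import statistics

temps = [-20.0, -20.5, -19.5, -20.0, -20.0, -12.0]   # last value: an outlier

summary = {
    "min": min(temps),
    "max": max(temps),                    # already hints at the outlier
    "mode": statistics.mode(temps),       # most frequent value; ignores outliers
    "median": statistics.median(temps),   # middle of the sorted values
    "mean": statistics.mean(temps),       # weights every value equally
    "stdev": statistics.pstdev(temps),    # mean variation around the mean (Formula 5-3)
}
```

The outlier leaves mode and median untouched but pulls the mean and the standard deviation upward, which is exactly the behavior discussed above.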
Section 5.10.1 describes a promising approach to using these basic statistical
measures to improve the current situation of insufficient information.
5.4 Regression<br />
The general idea of regression is to describe a dataset of value pairs (x, y) by a<br />
functional model, as described in section 5.1 ([Gentle02], p. 301). Looking at time<br />
series data, regression tries to determine a functional model that describes the<br />
change of a value y over time x. Figure 5-1 pictures two examples of regression.<br />
Figure 5-1: Two Samples of Regression ([Bourier03], p. 167) (adapted)<br />
5.4.1 The Determination of Regression Functions<br />
A common approach to determine such a regression function is the method of least
squares. This method is divided into three steps: ([Bourier03], p. 167)
1. The determination of a general trend from a graphical visualization or prior knowledge
2. The assignment of this general trend to a mathematical type of function
3. The numerical determination of the function’s parameters
The first two steps are normally trivial and serve as initialization. The third step
determines the function’s parameters such that the function describes the
development of the values best. To do that, the distance between the determined
regression function and all available values has to be minimal, which leads to the
method of least squares in Formula 5-4. The squaring is necessary to prevent
positive and negative deviations from canceling each other out. 47
$\min \sum_{i=1}^{n} \left(y_i - \hat{y}_i\right)^2$

with
$y_i$ = observed value at time $t = i$
$\hat{y}_i$ = value of the regression function at time $t = i$

Formula 5-4: Method of Least Squares
Based on this method, a regression function can be determined for a given type of<br />
function. Often applied types are: ([Daßler95], p. 43)<br />
47 See ([Bourier03], p. 168-169) for details<br />
1. $\hat{y} = ax + b$ (linear function)
2. $\hat{y} = ax^b$ (exponential function)
3. $\hat{y} = ae^{bx}$ (Euler function)
4. $\hat{y} = \frac{a}{x + b}$ (hyperbola)
5. $\hat{y} = a\ln(x) + b$ (logarithmic function)
The assumption of a linear trend leads to the usage of $\hat{y} = ax + b$ as the
regression function. Applying the method of least squares leads to Formula 5-5:

$\min \sum_{i=1}^{n} \left(y_i - b - ax_i\right)^2$

Formula 5-5: Method of Least Squares for an Assumed Linear Trend
To determine the parameters a and b, it is necessary to partially differentiate Formula
5-5 with respect to these parameters. Afterwards, the resulting equations have to be
solved for a and b. Performing these two steps leads to the following optimal linear
regression Formula 5-6: ([Bourier03], p. 169-171)
$\hat{y} = ax + b$

with
$a = \frac{\sum x_i y_i - n\bar{x}\bar{y}}{\sum x_i^2 - n\bar{x}^2}$
$b = \bar{y} - a\bar{x}$

Formula 5-6: Regression Function for Describing a Linear Trend
Other types of functions, like the ones mentioned above, can also be used for
regression purposes. The general idea stays the same; only the calculation steps
vary with different functions. As other types of regression are not of interest within
this diploma thesis, 48 they will not be considered here. 49
48 See section 5.10.2 for details<br />
49 More details can be found (e.g. [Eckey02], p. 171-184; [Bourier03], p. 172-179)<br />
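The parameter equations of Formula 5-6 translate directly into code; as a sketch, the sample value pairs below are hypothetical.

```python
# Linear least-squares regression per Formula 5-6:
#   a = (Σ x_i·y_i − n·x̄·ȳ) / (Σ x_i² − n·x̄²),  b = ȳ − a·x̄
# The sample data pairs are hypothetical.

def linear_regression(xs, ys):
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    a = (sum(x * y for x, y in zip(xs, ys)) - n * x_bar * y_bar) / \
        (sum(x * x for x in xs) - n * x_bar ** 2)
    b = y_bar - a * x_bar
    return a, b

xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.1, 8.0, 9.9]        # roughly y = 2x
a, b = linear_regression(xs, ys)       # a close to 2, b close to 0
```

The fitted function $\hat{y} = ax + b$ can then be evaluated for any $x$, e.g. for near-future time indices.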
5.4.2 The Major Problems of Regression<br />
Up to now, this section has only introduced the approach of regression. The last part
of this section will now review its two major problems: ([Eckey02], p. 179)
• An incorrectly chosen type of function leads to unacceptable results
• Significant outliers influence the determination of a regression function
The first problem can be solved partly by trying several types of functions and
selecting the best result afterwards. This is especially useful in cases of
automated regression, where the general type of function may change. The
application of regression to purely random data is problematic, because selecting
a certain type of function might be impossible.
The second problem can be even worse, because even a correct type of function
might lead to significantly incorrect results due to the influence of outliers. The two
following figures exemplify this. Both Figure 5-2 and Figure 5-3 contain a linear
trend, but the regression function obtained for the first dataset is significantly wrong
due to a single outlier.
Figure 5-2: Incorrect Regression Function due to an Outlier ([Eckey02], p. 180) (adapted)<br />
Figure 5-3: Correct Regression Function ([Eckey02], p.180) (adapted)<br />
A graphical form like this allows an easy validation of the obtained regression<br />
function. But there is also a mathematical measure that offers a quality factor. It is<br />
called the coefficient of determination and is defined by Formula 5-7. The general
idea is to split the total variance $Var_y^2$ into the variance caused by the
regression, $Var_{\hat{y}}^2$, and the residual variance $Var_u^2$. 50

$R^2 = \frac{Var_{\hat{y}}^2}{Var_y^2}$

with
$Var_y^2 = Var_{\hat{y}}^2 + Var_u^2$
$Var_{\hat{y}}^2 = \frac{1}{n} \sum_{i=1}^{n} \left(\hat{y}_i - \bar{y}\right)^2$
$Var_u^2 = \frac{1}{n} \sum_{i=1}^{n} u_i^2$

Formula 5-7: Coefficient of Determination
The coefficient of determination is a value between 0 and 1. If the regression does
not offer any more information than the mean value, no variance is caused by the
regression. This leads to a coefficient of 0. By contrast, a coefficient of 1 is obtained
if the regression variance is of the same magnitude as the total variance. In that
case, every observed value actually lies on the determined regression function.
([Eckey02], p. 181)
In practice, a coefficient of determination of at least 0.8 is demanded. In the case of
time series, an even higher coefficient such as 0.9 is demanded ([Eckey02], p. 181).
A regression function with such a high coefficient can be used for prediction
purposes by simply calculating regression values for the near future.
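Formula 5-7 can be sketched in a few lines; the observed values and fitted regression values below are hypothetical.

```python
# Coefficient of determination per Formula 5-7: the share of the total
# variance explained by the regression. Data and fit are hypothetical.

def r_squared(ys, y_hats):
    n = len(ys)
    y_bar = sum(ys) / n
    var_total = sum((y - y_bar) ** 2 for y in ys) / n            # Var_y^2
    var_regression = sum((yh - y_bar) ** 2 for yh in y_hats) / n  # Var_ŷ^2
    return var_regression / var_total

ys = [1.0, 2.0, 3.0, 4.0]              # observed values
y_hats = [1.1, 1.9, 3.1, 3.9]          # values of a fitted regression function
quality = r_squared(ys, y_hats)        # close to 1: the fit explains the data
```

Against the rule of thumb cited above, a result would be accepted only if it exceeds 0.8 (or 0.9 for time series).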
Section 5.10.2 will introduce a promising application of regression to determine a
trend, which indicates a change in behavior. By contrast, the prediction of an
upcoming malfunction is not possible using regression, because every significant
temperature rise would be predicted as an upcoming malfunction. As most rises are
caused by external influences and not by changes in general behavior, using
regression for prediction purposes would lead to at least the same high number of
false alarms.
50 See ([Eckey02], p. 180-181) for details<br />
The next section will introduce time series analysis. In contrast to regression, it is
limited to time series data, but it offers more analysis possibilities.
5.5 Time Series Analysis<br />
“A time series is a collection of observations made sequentially through time”<br />
([Chatfield04], p. 1). The major idea of time series analysis is to decompose the<br />
variation of a time series graph into the four following components to obtain a<br />
structure: ([Chatfield04], p. 12)<br />
1. Trend (t)<br />
2. Seasonal variation (s)<br />
3. Other cyclic variation (o)<br />
4. Other irregular fluctuations (i)<br />
In some cases, trend and other cyclic variations are combined, so that only three
components exist (e.g. [Bourier03], p. 158). In the following, this diploma thesis
will focus on the more common decomposition into four components.
The first component can be defined as a “long-term change in the mean level”
([Chatfield04], p. 12). The greatest problem is the definition of “long-term”:
depending on the setting, this could mean days as well as decades. The seasonal
variation offers information about predictable recurring behavior (e.g. buying
behavior in winter vs. buying behavior in summer).
Other cyclic variations are predictable as well but cover a smaller time span than the
seasonal variations. For instance, buying behavior during the daytime is higher than
at nighttime; this could be described by cyclic variation. Behavior that cannot be
explained by one of the components just mentioned has to be classified as other
irregular fluctuations. These irregular fluctuations have to be kept small to obtain an
expressive decomposition of a dataset’s variation. ([Chatfield04], p. 12)
Figure 5-4 exemplifies a marketing time series. The seasonal variation is easy to see,
because sales reach a maximum every winter and a minimum every summer.
Moreover, a trend is recognizable, because every winter a higher maximum and
every summer a higher minimum is reached. After the decline in December, there is
another small peak in January in most years. This could be classified as cyclic
variation.
Figure 5-4: Sales of an Industrial Heater [Chatfield04]<br />
A decomposition of a time series $y_t = (t_t, s_t, o_t, i_t)$ like that allows a nearly
complete description. Deviations are very small and have to be classified as other
irregular fluctuations. A prediction based on such a time series leads to much better
results than regression, because the regular variations t, s, and o are taken into
account.
To be able to identify these components, it is first of all necessary to define the
interaction of the single components. In general, two models exist. Formula 5-8
pictures the additive one and Formula 5-9 the multiplicative one.
([Bourier03], p. 158-159)
$y_t = t_t + s_t + o_t + i_t$

Formula 5-8: The Additive Component Model

$y_t = t_t \cdot s_t \cdot o_t \cdot i_t$

Formula 5-9: The Multiplicative Component Model
The first model is normally used if cyclic variations with a constant amplitude are
assumed. By contrast, the multiplicative model is used if the cyclic components
increase over time. ([Eckey02], p. 188)
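As a sketch of the additive model, the following constructs a synthetic series from a known trend and season (all values hypothetical) and recovers both components with a centered moving average; real monitoring data would of course not decompose this cleanly.

```python
# Additive decomposition sketch, y_t = t_t + s_t (+ i_t), on synthetic data
# with a linear trend and a seasonal pattern of (odd) period 3.

period = 3
season = [3.0, 0.0, -3.0]                       # repeats every period, sums to 0
trend = [10 + 0.5 * t for t in range(15)]
series = [trend[t] + season[t % period] for t in range(15)]

# Trend estimate: centered moving average over one full period.
trend_est = [sum(series[t - 1:t + 2]) / period
             for t in range(1, len(series) - 1)]

# Seasonal estimate: average the detrended values per position in the period.
buckets = {pos: [] for pos in range(period)}
for t in range(1, len(series) - 1):
    buckets[t % period].append(series[t] - trend_est[t - 1])
seasonal_est = [sum(vals) / len(vals) for vals in buckets.values()]
```

On this constructed series the seasonal estimate recovers the original pattern exactly, because the moving average removes the linear trend without distortion; with an even period a two-step centered average would be needed instead.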
The first step of a time series analysis is the identification of a possible trend. A very
common approach to identify a trend is the method of least squares, as explained in
section 5.4. If such a trend is present, it can be removed from the existing data so
that the residual components can be determined. A time series without a trend is
called stationary ([Chatfield04], p. 13). In fact, most methods require stationary time
series data.
After obtaining the trend, seasonal and other cyclic variations can be determined.<br />
This is done, for instance, by the use of the periodogram method. The general idea is<br />
to determine the distances to the trend function and to discover regular patterns. 51<br />
But as already mentioned in section 5.4.2, a prediction of an upcoming malfunction is
not possible, because every significant rise in temperature would lead to such a
prediction. Moreover, given the very low probability of a real malfunction, 52
deviations caused by randomly occurring external influences have to be classified as
irregular variations. Faced with these significant irregular variations, time series
analysis is not able to offer additional improvements compared to regression.
5.6 Failure- and Availability Ratios<br />
Common within the settings of quality assurance and condition monitoring are
operating ratios that specify the availability of systems and their tendency to fail.
Common ratios to define this behavior are the “mean time to failure” (abbr. MTTF),
the “mean time between failures” (abbr. MTBF) and the “mean time to repair” (abbr.
MTTR). The first two measures characterize the average time a unit works
correctly before breaking down. The only difference between MTTF and MTBF is that
the first one is used for parts that cannot or should not be repaired but replaced,
while the second one gives the average time between two necessary repairs of
high value parts. The MTTR characterizes the average time a repair takes.
([Masing88], p. 113)
These ratios can be used to specify the availability of systems by using Formula
5-10. This availability allows probability statements about whether a system can be
used during a specified time.
51 See (e.g. [Bourier03], p. 180-189) for details<br />
52 See section 2.2.5 for details<br />
$\text{Availability} = \frac{MTBF}{MTBF + MTTR}$

Formula 5-10: The Definition of Availability [Masing88]
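Formula 5-10 translates directly into code; the MTBF and MTTR figures below are purely illustrative, since, as noted next, real cooling-device manufacturers rarely publish such ratios.

```python
# Availability per Formula 5-10. The MTBF and MTTR values are hypothetical.

def availability(mtbf_hours, mttr_hours):
    """Fraction of time a unit is expected to be operational."""
    return mtbf_hours / (mtbf_hours + mttr_hours)

# A hypothetical device failing on average every 10,000 h, taking 8 h to repair:
a = availability(10_000, 8)            # slightly above 0.999
```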
The general idea of calculating the estimated availability during a specified time
seems promising. But this method of failure and availability ratios faces a major
problem when applied to the setting of sensor based temperature monitoring: most
manufacturers of cooling devices do not publish ratios like the MTTF [Nijmegen06].
As cooling devices are long-life products, a determination of these measures by the
user is practically impossible as well. Hence, failure and availability ratios are not
applicable within the setting of sensor based temperature monitoring.
5.7 Markov Chains<br />
Another approach to predicting breakdowns is the usage of Markov chains. These
chains are simple time-discrete stochastic processes $(X_n)_{n \in \mathbb{N}_0}$
with a countable state space $I$ that comply with the following Formula 5-11 for all
points in time $n \in \mathbb{N}_0$ and all states
$i_0, \ldots, i_{n-1}, i_n, i_{n+1} \in I$: ([Waldmann04], p. 11)

$P(X_{n+1} = i_{n+1} \mid X_0 = i_0, \ldots, X_{n-1} = i_{n-1}, X_n = i_n) = P(X_{n+1} = i_{n+1} \mid X_n = i_n)$

Formula 5-11: The Markov Property
This Markov property is the specific characteristic of Markov chains. It states that the
probability of changing to another state is influenced only by the last observed state
and not by prior ones. Hence, the probability that $X_{n+1}$ takes the value
$i_{n+1}$ is influenced only by $i_n \in I$ and not by
$i_0, \ldots, i_{n-1} \in I$. ([Waldmann04], p. 11)
The conditional probability $P(X_{n+1} = i_{n+1} \mid X_n = i_n)$ is called the
process’s transition probability. If this transition probability is independent of the
point in time $n$, the Markov chain is called homogeneous; otherwise it is called
inhomogeneous ([Waldmann04], p. 11). In the following, this thesis will focus on
homogeneous Markov chains. To improve readability, they will simply be called
Markov chains.
In the majority of cases, the transition probability is written as a matrix $P$. It
contains the probabilities $p_{ij}$ of all possible changes between the old state $i$
and the new state $j$, as pictured in Formula 5-12. Beside a change in state, it is
also possible that the state remains the same for another time interval. This
probability is given by the diagonal entries $p_{ii}$. ([Beichelt97], p. 146)
$P = \begin{pmatrix} p_{00} & p_{01} & \cdots & p_{0j} \\ p_{10} & p_{11} & \cdots & p_{1j} \\ \vdots & \vdots & & \vdots \\ p_{i0} & p_{i1} & \cdots & p_{ij} \end{pmatrix}$

Formula 5-12: Transition Probability Matrix
As every $p_{ij}$ represents a probability, all of them have to comply with
$0 \leq p_{ij} \leq 1$. Moreover, every state must have a succeeding state. Hence,
the probability of taking one of the available countable states as the next one has to
be one hundred percent. This leads to the conditions pictured in Formula 5-13 for
the transition probability matrix. ([Jondral02], p. 186-187)

$0 \leq p_{ij} \leq 1 \;\; \forall i, j \quad \text{and} \quad \sum_{j=1}^{N} p_{ij} = 1 \;\; \forall i$

Formula 5-13: Conditions for the Transition Probability Matrix
As the sum of every row within that matrix has to be 1, entries with $p_{ij} = 0$ can
be left out to offer a better overview. To achieve an even better overview, Markov
chains are often visualized as a graph. Every node of that graph represents a
possible state and every arrow a possible transition with a positive probability.
([Waldmann04], p. 17)
A Markov chain can be used, for instance, to describe the following gamble between
two people: A coin is thrown. Depending on which side faces up, one of the two
players wins the coin. Player one starts with four coins, player two with two coins.
The game ends as soon as one of the players owns all six coins. This leads to seven
possible states, because a player can own any number of coins between zero and
six. Provided that player one wins each throw with the same probability $p$, the
transition probability matrix looks like the one pictured in Formula 5-14. As described
above, this Markov chain can be visualized as a graph to allow a better overview of
the described process. A comparison of Formula 5-14 and the corresponding Figure
5-5 shows this improvement. 53
53 Example taken from ([Waldmann04], Chapter 2)<br />
$P = \begin{pmatrix} 1 & 0 & 0 & 0 & 0 & 0 & 0 \\ 1-p & 0 & p & 0 & 0 & 0 & 0 \\ 0 & 1-p & 0 & p & 0 & 0 & 0 \\ 0 & 0 & 1-p & 0 & p & 0 & 0 \\ 0 & 0 & 0 & 1-p & 0 & p & 0 \\ 0 & 0 & 0 & 0 & 1-p & 0 & p \\ 0 & 0 & 0 & 0 & 0 & 0 & 1 \end{pmatrix}$

Formula 5-14: Sample Transition Probability Matrix
Figure 5-5: Sample Transition Probability Graph<br />
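As a sketch, the matrix from Formula 5-14 can be built and checked programmatically; the winning probability 0.5 below is chosen arbitrarily for a fair coin.

```python
# Transition probability matrix from Formula 5-14 for the coin gamble:
# states 0..6 = coins owned by player one, p = player one's winning probability.

def gamble_matrix(p):
    n = 7
    P = [[0.0] * n for _ in range(n)]
    P[0][0] = 1.0                      # absorbing: player one is broke
    P[n - 1][n - 1] = 1.0              # absorbing: player one owns all coins
    for i in range(1, n - 1):
        P[i][i - 1] = 1.0 - p          # lose one coin
        P[i][i + 1] = p                # win one coin
    return P

P = gamble_matrix(0.5)                 # a fair coin, chosen arbitrarily

# Formula 5-13: every row must sum to one.
assert all(abs(sum(row) - 1.0) < 1e-12 for row in P)
```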
Up to now, the transition probability matrix only made it possible to obtain the
probability of a single change. Also important, however, are the probabilities of
several changes in a row, as pictured in Formula 5-15. ([Beichelt97], p. 147)
$p_{ij}^{(m)} = P(X_{n+m} = j \mid X_n = i) \qquad m = 1, 2, \ldots$

Formula 5-15: Transition Probabilities of Several Changes in a Row
$p_{ij}^{(m)}$ symbolizes the probability that state $i$ will change to state $j$ after
$m$ steps. Apparently, $p_{ij}^{(1)} = p_{ij}$ holds. The calculation for $m > 1$ can
be done by using the formula of Chapman-Kolmogorov, which is pictured in Formula
5-16. ([Beichelt97], p. 147)

$p_{ij}^{(m)} = \sum_{k \in I} p_{ik}^{(r)} \, p_{kj}^{(m-r)} \qquad r = 1, 2, \ldots, m-1$

Formula 5-16: Formula of Chapman-Kolmogorov
Using the knowledge that one state of the countable state space has to be taken
after $r$ steps, in combination with the law of total probability and the Markov
property, leads to a simple argument. 54 As a result, the transition probability matrix
for $r$ steps is determined by multiplying the matrix $r$ times by itself. This enables
a simplified version of Formula 5-16. ([Beichelt97], p. 148)
54 See ([Beichelt97], p. 147) for details<br />
$P^{(m)} = P^{(r)} \cdot P^{(m-r)} \quad \text{with} \quad P^{(m)} = \left(p_{ij}^{(m)}\right) \qquad m = 1, 2, \ldots$

Formula 5-17: Formula of Chapman-Kolmogorov (Simplified Version)
This simplified Formula 5-17 shows that every Markov chain can be described completely by just giving a starting distribution at step 0 and a transition matrix. 55
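As an illustration of Formula 5-17, the m-step transition probabilities can be computed as a matrix power, and a starting distribution propagated through it describes the chain completely. A minimal Python sketch; the three-state matrix is invented for illustration and not taken from this thesis:

```python
import numpy as np

# Illustrative 3-state transition matrix (rows sum to 1); invented, not from the thesis.
P = np.array([
    [0.7, 0.3, 0.0],
    [0.2, 0.5, 0.3],
    [0.0, 0.4, 0.6],
])

def m_step_matrix(P, m):
    """P^(m): the m-step transition probabilities (Formula 5-17)."""
    return np.linalg.matrix_power(P, m)

# Chapman-Kolmogorov: P^(m) = P^(r) . P^(m-r) for any 1 <= r <= m-1
m, r = 5, 2
lhs = m_step_matrix(P, m)
rhs = m_step_matrix(P, r) @ m_step_matrix(P, m - r)

# A starting distribution at step 0 plus the transition matrix
# describes the chain completely:
pi0 = np.array([1.0, 0.0, 0.0])   # chain starts in state 0
pi_m = pi0 @ lhs                  # state distribution after m steps
```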
As mentioned above, a Markov chain is often used to predict breakdowns. Therefore,<br />
the existing states have to be classified as critical and uncritical ones. In general,<br />
states i ∈ I with p_ii = 1 are critical ones. They are called absorbing states. Figure 5-5
contains two absorbing states because after taking state 0 or 6 all following states<br />
will remain the same. Markov chains can now be used to determine the probability that a critical state is reached. If an absorbing state is reached with a probability of one hundred percent, the mean number of steps after which it is reached can also be determined. ([Waldmann04], p. 18)
This determination can be done by calculating P^(m) for m = 1, 2, …. Markov chains often converge to a stationary distribution, so that the probability of an absorbing state can be given for an infinite number of state changes. Formula 5-18 introduces a counter-example that does not converge. Hence, the probability that an absorbing state is reached during the whole processing time of the Markov chain can be obtained in many cases, but not in every case. ([Waldmann04], p. 40)
P = ⎛0 1⎞
    ⎝1 0⎠
Formula 5-18: A Permutation Matrix as an Example of a Non-Converging Markov Chain
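The difference between this periodic counter-example and a converging absorbing chain can be checked numerically. A small Python sketch; the three-state absorbing matrix is invented for illustration:

```python
import numpy as np

# The periodic matrix of Formula 5-18: its powers alternate and never converge.
P_periodic = np.array([[0.0, 1.0],
                       [1.0, 0.0]])
even = np.linalg.matrix_power(P_periodic, 10)   # identity matrix
odd = np.linalg.matrix_power(P_periodic, 11)    # the swap again

# Invented chain with absorbing state 0 (p_00 = 1): here P^m does converge,
# and every row of the limit puts all mass on the absorbing state.
P_abs = np.array([[1.0, 0.0, 0.0],
                  [0.3, 0.4, 0.3],
                  [0.0, 0.5, 0.5]])
P_limit = np.linalg.matrix_power(P_abs, 500)    # absorption is certain here
```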
The Markov property can also be transferred to the setting of time-continuous stochastic processes. The result is called a Markov process. The biggest difference from Markov chains is the non-applicability of the state probability calculations described above. The results can no longer be determined by simply multiplying matrices, but require solving differential equations. To ease calculations, the underlying process is often
55 See ([Beichelt97], p. 148) for details<br />
assumed to be asymptotically stationary, and only the stationary state is used for calculations. This leads to a linear system of equations, which can again be solved with little effort. 56
Both Markov chains and Markov processes have become important analysis methods within many different settings. As mentioned above, they can be used, for instance, to predict the time of first occurrence of a critical system state. Looking back at the setting of machinery condition monitoring from section 4.2, Markov chains could be used, for example, to predict upcoming malfunctions due to friction. ([Waldmann04], p. 6-7)
The Markov property also seems promising within the setting of sensor based temperature monitoring, because a cooling device may malfunction at any time, no matter how long it worked fine before. But as already mentioned in section 2.2.5, a real technical malfunction has a very low, unknown probability. Hence, starting distribution and transition matrix cannot be determined.
5.8 Inferential Statistics<br />
In contrast to descriptive statistics, inferential approaches do not describe available datasets but try to generalize knowledge gained from existing data. These methods are applied to problems where data cannot be obtained entirely. The general idea is to analyze a representative sample of the statistical universe. But only if the sample is truly representative can the gained results be applied correctly to the whole statistical universe. ([Eckey02], p. 242)
The generalization of gained information is always bound to probability calculus. Hence, the general approach of inferential statistics is to determine the distribution of a representative sample. Afterwards, this distribution can be used to perform interval estimations, hypothesis tests or other similar methods. 57
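As an illustration of such an interval estimation, a normal-approximation confidence interval for a sample mean can be computed directly. A Python sketch; the sample values are invented:

```python
import math
import statistics

# Invented sample of nighttime freezer temperatures (°C).
sample = [-19.8, -20.1, -19.9, -20.3, -20.0, -19.7, -20.2, -19.9, -20.1, -20.0]

n = len(sample)
mean = statistics.fmean(sample)
s = statistics.stdev(sample)          # sample standard deviation

# 95% normal-approximation confidence interval for the population mean
z = 1.96
half_width = z * s / math.sqrt(n)
ci = (mean - half_width, mean + half_width)
```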
An application to sensor based temperature monitoring would require such a representative sample to determine the distribution. In fact, such a representative sample does not exist because of the randomness of external influences. This problem could partly be solved by using monitoring data of a longer time period as a representative sample to calculate the distribution.
56 See ([Waldmann04], Chapter 4) for details<br />
57 See (e.g. [Scharnbacher04]) for details<br />
But the greatest problem is again the probability calculation, because a calculated low probability of a short-term malfunction could again lead to the assumption that the corresponding cooling device will not break down. 58
5.9 Data Mining<br />
Data mining represents a special way of statistical data analysis. Its main purpose is<br />
to determine relationships between several items that were not recognized<br />
previously. Most often these relationships were not of primary interest at collection<br />
time. ([Gentle02], p. 123)<br />
To be able to apply data mining, many companies nowadays collect as much data as possible. In former times, check-outs in supermarkets, for instance, just summed unit prices to calculate the final amount. Modern check-outs log every single product as well as other available data. Moreover, credit cards or discount cards allow a customer’s identification. ([Martin98], p. 249-250)
These collected datasets can be analyzed by the use of data mining methods to gain additional knowledge. One goal could be, for example, sales promotions adapted to different kinds of customers. Furthermore, the determination of the customers’ buying behavior could be of interest. A possible result could be that eighty percent of customers that buy beer also buy potato chips. ([Lusti02], p. 262)
5.9.1 General Fields of Application<br />
The just quoted examples already introduced some very common approaches. In<br />
general, data mining is divided into five fields of application: ([Lusti02], p. 262)<br />
1. Text mining<br />
2. Association rule mining<br />
3. Prediction<br />
4. Clustering<br />
5. Classification<br />
Text mining is the most basic field of application. Its purpose is to find patterns in text<br />
files for information retrieval. Therefore, special search algorithms have to be<br />
58 See also section 5.6<br />
implemented. These implementations are characterized by the type of text input and the expected output. 59
A very popular example of text mining is the automated collection of e-mail and postal addresses from internet pages. The text mining algorithm has to identify these addresses as well as links to other pages in order to continue searching. But as this data mining approach can only be applied to textual data, this diploma thesis will not go into further detail.
Association rule mining is a multi-criteria approach. Its purpose is the explorative discovery of dependencies between several items. Association rule mining is based on statistical correlation analysis. But as it is a multi-criteria approach, it needs at least two different measures as input. 60
The above mentioned example of beer and potato chips is a typical application, but applying the approach to temperature monitoring data does not seem promising, especially because the only possible information gain is a correlation between door openings and temperature behavior, which is generally known already. Hence, this data mining approach will also be left out of this diploma thesis.
Prediction methods like regression and time series analysis have already been introduced. The setting of data mining offers an additional approach, the so-called artificial neural networks. These networks will be introduced and reviewed in the succeeding subsections.
Clustering is an approach that scans large datasets and tries to identify different, previously unknown groups. Clustering is often used as a first step before applying other data mining methods to the identified groups ([Martin98], p. 269). The example of adapted sales promotion could be achieved by the use of clustering. For this purpose, groups that best separate customer behavior (e.g. a separation by special interest or buying behavior) are determined automatically ([Lusti02], p. 261).
Classification is similar to clustering. The main difference is the already existing knowledge of the classes (e.g. “creditworthy” vs. “not creditworthy”). A simple but basic approach is the so-called rule induction. New rules are either created by experts or by an analysis of historical data ([Gentle02], p. 237-238). Automated
59 See (e.g. [Multhaupt00], chapter 3-4) for details<br />
60 See (e.g. [Wittenberg98], p. 161-165) for details<br />
clustering as well as an analysis of historical data is often done by the use of artificial<br />
neural networks ([Blasig95], p. 3-4).<br />
The succeeding subsections will introduce artificial neural networks and will review<br />
their applicability to sensor based temperature monitoring data. A positive review<br />
would allow the usage of automated clustering and classification.<br />
5.9.2 Artificial Neural Networks<br />
Artificial neural networks are based on the functioning of the human brain. Every brain consists of neurons. These neurons are stimulated by neighboring neurons via chemical impulses, the so-called neurotransmitters. Neurons transform incoming chemical impulses into electrochemical signals and relay them to the neighboring neurons. A regular exchange of these signals between two neurons leads to a high activation of this connection. By contrast, sparse communication leads to a low activation or even a loss of the connection. ([Martin98], p. 262-263)
The underlying basic principle is learning from failures. A connection that represents an error is assigned a low activation level after recognition. By contrast, generally valid facts are represented by a highly activated connection. ([Lusti02], p. 316)
Artificial neural networks adopt this functioning. They are defined by a tuple (N, V, F).<br />
N is a set of neurons. A neuron n_i is defined by Formula 5-19. V and F represent a
set of directed connections between neurons and a set of learning functions<br />
respectively, which are defined by Formula 5-20. ([Hagen97], p.6-7)<br />
n_i = (x(t), w_i(t), a_i(t), f, g, h)
with
x(t) = (x_1(t), …, x_n(t)) ∈ R^n as input vector at time t
w_i(t) = (w_i1(t), …, w_in(t)) ∈ R^n as weighting vector at time t
a_i(t) ∈ R as activation level at time t
h: R^n × R^n → R with s_i(t) = h(x(t), w_i(t)) as propagation function to provide the input signal s_i(t)
g: R × R → R with a_i(t) = g(s_i(t), a_i(t − 1)) as activation function to calculate the activation level a_i(t)
f: R → R with y_i(t) = f(a_i(t)) as output function to calculate the output y_i(t)
Formula 5-19: Definition of Neurons
V ⊆ N × N is a set of directed connections (n_i, n_j).
F = {F_i : n_i ∈ N} is a set of learning functions, which calculate new weightings for the neurons:
w_i(t_2) = F_i(W(t_1), y(t_1), a(t_1), d)
with
W = weighting matrix
y = output vector
a = activation vector
d = aimed output vector (not necessary in case of a self-organized network (see below))
Formula 5-20: Definition of V and F
Figure 5-6 pictures the above given definition of an artificial neuron. Due to this functioning, an artificial neural network is similar to a Petri net, but dynamic, because input and output can vary over time.
Figure 5-6: Functioning of an Artificial Neuron ([Hagen97], p. 8) (adapted)<br />
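The tuple definition of Formula 5-19 can be turned into a small executable sketch. The concrete choices below — a weighted sum as propagation function h, an activation function g that ignores the previous activation level, and a sigmoid output function f — are illustrative assumptions, since Formula 5-19 leaves these functions open:

```python
import math

def h(x, w):
    """Propagation function: input signal s(t) as a weighted sum (assumption)."""
    return sum(xj * wj for xj, wj in zip(x, w))

def g(s, a_prev):
    """Activation function: here simply the new input signal; a_prev is unused (assumption)."""
    return s

def f(a):
    """Output function: sigmoid of the activation level (assumption)."""
    return 1.0 / (1.0 + math.exp(-a))

def neuron_output(x, w, a_prev=0.0):
    s = h(x, w)           # s_i(t) = h(x(t), w_i(t))
    a = g(s, a_prev)      # a_i(t) = g(s_i(t), a_i(t - 1))
    return f(a)           # y_i(t) = f(a_i(t))

y = neuron_output(x=[1.0, 0.5], w=[0.8, -0.4])   # weighted sum s = 0.6
```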
Just like a human brain, artificial neural networks have to learn. During this initialization phase, training data is applied to an untrained network to determine the weightings. These weightings remain unchanged once the initialization phase is completed. In most cases, a representative part of the whole available data is taken as training data. ([Lusti02], p. 320-322)
In general, two approaches of learning do exist:<br />
• Supervised learning<br />
• Unsupervised learning<br />
The general idea of supervised learning is a feedback of already known results. This means that during initialization not only inputs are provided but also aimed results. Hence, the neural network is able to adapt its weightings to these aimed results. These results can be provided in two ways. First, historical data could be used that already contains results (e.g. a forecast done by the network can be evaluated by comparing it to the value that actually occurred). The other possibility of supervised learning is the usage of a trainer. This trainer evaluates the results of training inputs and rates them. These ratings signal to the network how the weightings have to be changed. ([Heuer97], p. 16-17)
Hence, supervised learning is done by reacting to errors. A common learning approach is the usage of the delta rule. As described above, the neural network determines an output vector y for a given input vector x. Moreover, a vector d must be given, which contains the aimed results. To be able to apply the delta rule, the magnitude of the error has to be calculated by using the following Formula 5-21: ([Hagen97], p. 22-23)
δ_i = d_i − y_i,  i = 1, …, n
with
δ_i = error
d_i = aimed result
y_i = calculated output
Formula 5-21: Determination of Error<br />
As described above, this error is used to adapt the weightings between the single<br />
neurons. Formula 5-22 contains the often used delta rule that shall exemplify<br />
supervised learning.<br />
w_ij(t + 1) = w_ij(t) + α · (d_i(t) − y_i(t)) · x_j(t) = w_ij(t) + α · δ_i(t) · x_j(t),  i = 1, …, n
with
w_ij = weighting of the connection from n_j to n_i
α > 0 = learning rate
δ_i = error
x_j = given input
d_i = aimed result
y_i = calculated output
Formula 5-22: The Delta Rule<br />
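The delta rule can be sketched for a single linear neuron, assuming an identity output function so that y = w · x; the training samples are invented:

```python
def train_delta(samples, n_inputs, alpha=0.1, epochs=500):
    """Delta rule: w_ij <- w_ij + alpha * (d_i - y_i) * x_j (Formula 5-22)."""
    w = [0.0] * n_inputs
    for _ in range(epochs):
        for x, d in samples:
            y = sum(wj * xj for wj, xj in zip(w, x))   # identity output function
            delta = d - y                              # error (Formula 5-21)
            w = [wj + alpha * delta * xj for wj, xj in zip(w, x)]
    return w

# Invented, consistent training data for the target d = 2*x1 - 1*x2.
samples = [([1.0, 0.0], 2.0), ([0.0, 1.0], -1.0), ([1.0, 1.0], 1.0)]
w = train_delta(samples, n_inputs=2)
```

With the small learning rate the cyclic updates converge to the exact weights of the consistent target.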
Unsupervised learning has to be used if only the data analysis question, but not the result, is available. The general idea is again to train the network with sample patterns. But this time, the network has to find and evaluate structures itself. A necessary requirement is redundancy within the input vectors. The more redundancy, the better the training results, because redundancy allows the identification of noise and disturbances. ([Heuer97], p. 18; [Hagen97], p. 19)
Most unsupervised approaches use Hebb learning. This principle is adopted from the human brain, where the weighting of the connection between two simultaneously active neurons increases. Hebb postulated that the weight change is proportional to the product of the two neurons’ outputs. Formula 5-23 summarizes this approach. ([Hagen97], p. 20)
w_ij(t + 1) = w_ij(t) + α · y_i(t) · y_j(t)
with
w_ij = weighting of the connection from n_j to n_i
α > 0 = learning rate
y_i, y_j = calculated outputs
Formula 5-23: Hebb Learning Rule<br />
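Formula 5-23 reduces to a one-line update. In the invented example below, two neurons that are mostly active together end up with a strengthened connection:

```python
def hebb_update(w_ij, y_i, y_j, alpha=0.1):
    """Hebb rule: the weight change is proportional to the product of both outputs."""
    return w_ij + alpha * y_i * y_j

# Invented output pairs: neurons i and j are mostly active together.
outputs = [(1.0, 1.0), (1.0, 1.0), (0.0, 1.0), (1.0, 1.0), (1.0, 0.0)]
w = 0.0
for y_i, y_j in outputs:
    w = hebb_update(w, y_i, y_j)   # three co-activations strengthen the weight
```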
5.9.3 Non-Applicability of Artificial Neural Networks to Current Datasets<br />
Although many different specific artificial neural networks exist, they are always based on either supervised or unsupervised learning methods. An application of an unsupervised artificial neural network to currently obtained temperature data is not possible, because each input vector would only contain a time, a temperature and sometimes data of a door opening sensor. These three variables always contain different kinds of information, so that each input vector is free of redundancy. But as mentioned in the last section, redundancy is necessary to identify structures or patterns in case of unsupervised learning.
By contrast, supervised artificial neural networks are used within other settings to classify the condition of a monitored device. But certain preconditions always have to be met. A neural network is able to predict upcoming failures of pumps, for instance. This is possible because a pump shows a nearly constant behavior. Typical slight changes in behavior over time that indicate an upcoming malfunction can be learned by an artificial neural network, because every device behaves in nearly the same way. (e.g. [Hawibowo97], chapter 5)
At the moment, a problem of applicability is that the provided datasets from the UMC St. Radboud only cover a time range of about one year. Moreover, not a single technical malfunction occurred during that year, 61 so that this data would be insufficient to train an artificial neural network on the recognition of technical malfunctions.
But even if the datasets contained some errors, a general problem is again the very small quantity of real malfunctions. 62 This could lead to a learning behavior that ignores malfunctions because of very low weightings of the corresponding edges within the network.
5.10 Promising Analyzing Methods<br />
As neither the generalized approach from section 4.3.2 nor other approaches are directly applicable to the setting of sensor based temperature monitoring, this section will combine the collected ideas into promising analyzing methods. Chapter 6 will apply these suggested approaches to data from the UMC St. Radboud and will review them according to the requirements analysis.
5.10.1 Promising Application of Basic Descriptive Statistics
Section 5.3 introduced the most common descriptive statistical measures. As the<br />
basic ones are applicable to all kinds of stored numerical data, this section will<br />
61 See section 2.2.5 for details<br />
62 See section 2.2.5 for details<br />
identify the probable information gain by using descriptive statistics to better comply<br />
with the determined requirements. 63<br />
The basic idea is to detect changes in general behavior by comparing basic measures from succeeding time intervals. This time interval can generally be chosen freely. As this diploma thesis assumes data of at least several months, a time interval of one day per value leads to meaningful results. 64
The smaller the chosen time interval, the higher the influence of new measurement values. On the other hand, too small time intervals, like hours for instance, could lead to significantly deviating results even if the behavior of the monitored device did not change. This problem is mostly caused by the random behavior of employees. Even a chosen time interval of one day per value can lead to deviating results if the user behavior differs significantly. To exclude random user behavior, a division of the analyzing task into daytime and nighttime data analysis is promising.
The door sensor data can be used to define the daily daytime and nighttime intervals. Daytime starts with the first door opening and ends a defined time range after the last door opening. This allows, for instance, a comparison of nighttime or daytime mean values of different days. As the nighttime values are not influenced by random user behavior, they should be very similar. Variances in daytime values could indicate deviating employee behavior.
Based on this idea, the aimed gain of additional knowledge can be added to XiltriX (or, of course, to any other monitoring system) by implementing an automated notification service. This service should calculate and compare daily daytime and nighttime values of minimum, maximum, mean and standard deviation. Median and mode values are not promising because they ignore outliers. 65
In case of a significant change of one of these values from one day to another, the person in charge should be notified. A change should be classified as significant as soon as the daily change is higher than delta times the regular changing behavior. To be able to define this behavior, historical data is needed. In general, data of a few days may suffice, but data of a longer time span makes the gained results more reliable.
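The described notification service can be sketched as follows. The delta factor of 3, the chosen statistics and the invented per-night temperature readings are illustrative assumptions:

```python
import statistics

def daily_stats(values):
    """Minimum, maximum, mean and standard deviation of one day's readings."""
    return {
        "min": min(values),
        "max": max(values),
        "mean": statistics.fmean(values),
        "stdev": statistics.stdev(values),
    }

def significant_changes(history, today, delta=3.0):
    """Flag every statistic whose change since yesterday exceeds delta times
    the regular day-to-day change observed in the historical data."""
    days = [daily_stats(v) for v in history]
    today_stats = daily_stats(today)
    flagged = []
    for key in today_stats:
        regular = statistics.fmean(
            abs(a[key] - b[key]) for a, b in zip(days, days[1:])
        )
        change = abs(today_stats[key] - days[-1][key])
        if regular > 0 and change > delta * regular:
            flagged.append(key)
    return flagged

# Invented nighttime temperatures (°C) for four past nights and a deviating one.
history = [
    [-20.0, -20.2, -19.9, -20.1],
    [-20.1, -20.0, -19.8, -19.9],
    [-20.0, -20.1, -20.0, -19.9],
    [-20.2, -20.0, -19.9, -20.1],
]
today = [-18.5, -18.0, -17.5, -18.2]   # a clearly warmer night
alerts = significant_changes(history, today)
```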
63 See section 2.5 for details<br />
64 See section 6.2.1 for details<br />
65 See section 5.3 for details<br />
Besides the introduced notification service, a comparison of the quantity of door openings across devices and a graphical distribution of the stored temperature sequence should also be added. The analysis of door openings could be used to optimize the usage of cooling devices. If, for example, two freezers of the same type exist, they should on average have about the same quantity of door openings. Otherwise, some frequently accessed contents should be moved to the device with fewer door openings to improve the cooling behavior of both freezers.
The graphical distribution could be used to get a quick impression of the cooling device’s accuracy. A narrow distribution with a high peak indicates a very accurate behavior with little deviation. By contrast, a broad distribution or several smaller peaks may indicate a very inaccurate behavior. 66 This suggested distribution should offer highly aggregated information to evaluate the general behavior of the corresponding cooling device at a glance.
Section 5.10.4 will review probable improvements by applying basic statistics and<br />
consecutively introduced ideas to the setting of sensor based temperature<br />
monitoring. The case study in chapter 6 applies these ideas to a sample dataset to<br />
validate these estimated improvements.<br />
5.10.2 Detection of Changes in Behavior by the Use of Regression<br />
The last section introduced a promising possibility to detect changes in behavior on a daily basis by the use of basic statistical measures. The use of regression is also promising. Section 5.4.2 pointed out that regression can be used to describe a temperature sequence by determining a regression function. This possibility leads to two general ideas:
1. The comparison of single cooling cycles to each other<br />
2. The determination of a trend in general behavior on the long run<br />
To compare single cooling cycles to each other, a polynomial regression function<br />
could be determined for a single representative cooling cycle. Afterwards, the<br />
coefficient of determination could be used to calculate the fit of other cycles to this<br />
regression function. 67 In case of a significant change in fit, a general change in<br />
behavior could be discovered.<br />
66 See section 6.2.1 for details<br />
67 See section 5.4.2 for details<br />
This idea would allow the recognition of changes within a very short time. Nevertheless, it is not promising because the single cooling cycles differ from each other due to technical and other reasons. 68 Hence, an application of this method would lead to a high quantity of additional false alarms.
The second idea is based on the assumption that temperature sequences of cooling devices should not contain a trend on the long run. To obtain a presumable trend, a linear regression function could be used. In case of a good fit (coefficient of correlation ≥ 0.9), 69 the gradient of the regression function determines the trend. As already mentioned, this method is only promising on the long run, because linear regression functions on the short run are highly influenced by outliers. 70 This would lead to a too small coefficient of determination.
5.10.3 Classification by Using Past Behavior<br />
As a real malfunction is very improbable and the current number of alarms leads to a loss of credibility, more system states have to be defined than just “OK” and “Malfunctioning”. 71 Moreover, most alarms are user-induced due to door openings [Nijmegen06]. An application of data mining methods may provide these additional system states. The main idea is to compare the current behavior to similar situations in the past and their succeeding development.
For this purpose, a classification of alarms into different levels (e.g. green, yellow, red) could be used to indicate how critical a current temperature exceedance is. To achieve such a classification, every alarming situation is compared to all other situations in the past. The general assumption is that an alarm is only classified as red if it is significantly different from most previous ones. Furthermore, alarms that cannot be connected to a previous door opening should immediately be classified as red alarms. To be able to classify the door-made alarms, different criteria have to be found.
A promising suggestion is the usage of the following criteria:<br />
• The duration of door openings<br />
• The maximum temperature during an alarm<br />
• The maximum duration of an alarm<br />
68 See section 2.3 for details<br />
69 See section 5.4.2 for details<br />
70 See Figure 5-2 for details<br />
71 See section 4.3.2 for details<br />
A classification could be achieved by calculating the probability of the actual situation based on historical data. The underlying assumption is that a situation that is more exceptional than any other in the past indicates a critical situation with a probability of one hundred percent. If, for instance, a current door opening already takes more than one minute, a comparison to past values could show that ninety percent of all door openings took less time. Hence, the probability that the current situation is critical is ninety percent. To put this value on a broader basis, the maximum probability over all three criteria should be taken.
The probability of a critical situation could be used to define the above suggested<br />
alarm levels:<br />
• Green alarm: probability of a critical situation ≥ 50%<br />
• Yellow alarm: probability of a critical situation ≥ 75%<br />
• Red alarm: probability of a critical situation ≥ 95%<br />
This definition is only used exemplarily and can be adapted to other values. Especially the assumption that probabilities < 50% shall not be classified as alarms, although the critical temperature level is exceeded, might be problematic in some settings. But this assumption saves a lot of alarms without increasing the risk too much. 72
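The suggested classification can be sketched as an empirical percentile lookup over historical data. The criteria names, historical values and the situation being classified are invented; the 50/75/95 percent thresholds follow the definition above:

```python
def criticality(value, history):
    """Empirical probability: fraction of historical values below the current one."""
    return sum(h < value for h in history) / len(history)

def classify_alarm(current, histories, door_opening_seen=True):
    """current and histories are dicts keyed by criterion; the maximum
    probability over all criteria decides the alarm level."""
    if not door_opening_seen:
        return "red"                      # no explaining door opening
    p = max(criticality(current[k], histories[k]) for k in current)
    if p >= 0.95:
        return "red"
    if p >= 0.75:
        return "yellow"
    if p >= 0.50:
        return "green"
    return "no alarm"

# Invented historical data for the three suggested criteria.
histories = {
    "door_open_s": [20, 25, 30, 30, 35, 40, 45, 50, 55, 70],
    "max_temp_c": [-18.0, -17.5, -17.0, -16.8, -16.5, -16.0, -15.5, -15.0, -14.5, -14.0],
    "alarm_s": [60, 90, 120, 120, 150, 180, 200, 240, 300, 600],
}
level = classify_alarm(
    {"door_open_s": 65, "max_temp_c": -16.9, "alarm_s": 130},
    histories,
)
```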
Using classification like this offers the person in charge additional operational decision support on whether an occurring alarm has to be taken seriously or not. Section 6.2.3 will apply this method to a sample dataset to point out the possible improvements.
5.10.4 Review<br />
The last three sections introduced promising ideas to improve the current monitoring<br />
situation. This section will now review the expected improvements. Whether these<br />
methods really lead to the expected gain of information will be reviewed in chapter 6<br />
by applying them to a sample dataset.<br />
Section 5.10.1 pointed out that the application of descriptive statistics to monitoring data might be used to recognize significant changes of general cooling behavior on a daily basis. Especially the analysis of daily nighttime values could recognize changes
72 See section 6.2.3 for details<br />
that are not caused by user interaction. This improvement would extend the currently<br />
limited possibility to recognize general changes on the short-run. 73<br />
In addition to that, the suggested analysis of door openings might offer the possibility to optimize the usage of cooling devices. Although this was not part of the requirements analysis, a relief of frequently opened devices could lead to a better cooling behavior and fewer alarms. The suggested graphical distribution likewise does not improve factors of the requirements analysis. But it could indicate the cooling device’s accuracy and behavior at a glance. This additional knowledge might support decisions in case of uncertainty about the cooling device’s general condition.
Section 5.10.2 presented a promising way to discover a trend by the use of regression. Such a trend would indicate a change in general behavior on the long run. Hence, a combination of basic statistics and regression could lead to a limited ability to predict upcoming failures. In fact, a definite prediction is not possible due to the very low probability and the lack of information. 74 But a detected change may prompt a person in charge to have a closer look at the corresponding cooling device.
These estimated improvements can be extended even further by using not only statistical analysis but also data mining. The suggested classification from section 5.10.3 establishes additional system states besides “OK” and “Malfunctioning”. These states could allow a more accurate description of the current system state because an alarm is rated. Based on these ratings, a person in charge could react to a temperature exceedance in a better way. Moreover, external influences are partly recognized because the classification of alarms also depends on the occurrence of door openings in advance of a temperature exceedance.
Hence, a combination of statistical analysis and data mining promises to significantly improve the current monitoring situation. The estimated improvements are summarized again in Table 5-1. Blue dashed arrows represent estimated improvements by the use of statistical analysis and magenta dashed arrows represent estimated improvements by the use of data mining.
73 See section 6.3 for details<br />
74 See sections 2.4.1, 5.4.2 and 5.9.3 for details<br />
Table 5-1: Estimated Improvements
• Approach is able to classify the current state of a monitored device
• Approach is able to recognize significant changes of behavior on the short run
• Approach is able to recognize significant changes of behavior on the long run
• Approach is able to predict upcoming failures
• Approach is able to identify failures as soon as they are recognizable
• Approach is able to avoid an error of the second kind in any case
• Approach is able to recognize external influences
• Approach is able to optimize the usage of cooling devices
• Approach offers a very quick overview of the cooling device’s accuracy
[The arrow markings of Table 5-1 that indicate which method yields which improvement are not reproduced in this excerpt.]
The succeeding chapter 6 will present an application of the introduced methods to a<br />
selected sample dataset from the UMC St. Radboud to evaluate the actual<br />
improvements in practice.<br />
85
6 Implementation and Case Study<br />
This chapter will apply the promising analyzing methods from section 5.10 to a<br />
selected dataset from the UMC St. Radboud. First, section 6.1 will introduce the<br />
major problems, and the solutions actually adopted, in performing the data analysis of<br />
the exported XiltriX data. Afterwards, the calculated results will be presented. A<br />
review of the information gain with respect to the previously defined estimated improvements<br />
will conclude this chapter.<br />
6.1 Implementation of Promising Analyzing Methods<br />
Section 3.2.1 already introduced the data stored by XiltriX. The export functionality<br />
allows storing this data to disk as a comma-separated values file (CSV). An<br />
excerpt of such a file is pictured in Figure 6-1.<br />
Figure 6-1: Exported XiltriX Data (An Excerpt)<br />
CSV is a standard file format and can be read by a large number of programs. In<br />
fact, the import of this data into other programs is problematic for the following two<br />
reasons:<br />
• Programs failed to import the string-based date and time correctly<br />
• Programs failed to manage the occurring large datasets<br />
Table 6-1 lists the tested programs and the problems encountered:<br />
Table 6-1: Import Problems of Tested Software Products<br />
Product (version): Origin 7.5, Euler 2.4, Rt-Plot 2.7, FreeMat 2.0, MS Excel 2002<br />
Evaluated criteria: the product is able to manage the occurring large datasets;<br />
the product is able to import date and time correctly<br />
86
As all these software products fail to offer a satisfying solution, Matlab is used in the<br />
following to implement the suggested methods. Matlab, too, has some<br />
problems importing the original datasets, but these can easily be solved by<br />
changing some delimiters in the CSV file. 75 Moreover, Matlab is capable of importing<br />
date and time correctly and is able to process very large datasets.<br />
In the following, the general ideas of the implementation will be introduced.<br />
The technical realization, and annotations on problems that occurred due to Matlab’s<br />
limited programming possibilities, can be found in the appendix.<br />
Due to the storage behavior of XiltriX, 76 the stored values are recorded at irregular<br />
time intervals. An example of this behavior is pictured in Figure 6-1. But the suggested<br />
analyzing methods from section 5.10 assume constant time intervals. Hence, the first<br />
step of data analysis is an interpolation of the stored datasets.<br />
The basic idea is to create new datasets of measurement values with regular<br />
time intervals. Door openings are assigned to the beginning of the minute in which they<br />
occur. A combination of the original and the interpolated datasets can then be used to<br />
calculate the desired values, as described in the following, without adapting<br />
the described methods. Certainly, in case of an implementation in<br />
XiltriX, its storage behavior should be adapted so that interpolation is no longer<br />
necessary.<br />
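The thesis implements this step in Matlab; as a rough illustration of the idea only, the following Python sketch (function name and sample data are made up) maps irregularly timed samples onto a one-minute grid by linear interpolation:

```python
import numpy as np

def to_regular_grid(times_min, temps, step_min=1.0):
    """Interpolate irregularly timed measurements onto a regular grid.
    times_min: timestamps in minutes since the first sample (irregular,
    as stored by XiltriX); temps: the measured temperatures."""
    times = np.asarray(times_min, dtype=float)
    temps = np.asarray(temps, dtype=float)
    # Regular grid from the first to the last timestamp
    grid = np.arange(times[0], times[-1] + 1e-9, step_min)
    return grid, np.interp(grid, times, temps)

# Samples arriving every one to three minutes
grid, interp = to_regular_grid([0.0, 1.0, 3.0, 6.0], [5.0, 5.2, 5.6, 6.2])
print(len(grid))  # 7 grid points, one per minute from 0 to 6
```

Linear interpolation has the convenient property that interpolated values always stay between the neighboring measurements, so it cannot introduce artificial peaks.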
After this interpolation, the desired statistical measures can be calculated. To this end,<br />
the original and the interpolated datasets are divided into single days, and these days<br />
again into daytime and nighttime. As described in section 5.10.1, daytime limits can<br />
be obtained by analyzing the door openings. Based on these classes, the<br />
promising measures maximum, minimum, mean, standard deviation and the number<br />
of door openings can be calculated for daytime, nighttime and the<br />
whole day.<br />
To obtain correct results, the calculation of minimum and maximum values has to be<br />
based on the original data to avoid smoothing. By contrast, mean and standard<br />
deviation have to be based on the interpolated data to achieve a correct weighting in<br />
time. The determination of door openings can be based on either dataset, because<br />
the number of door openings remains unchanged after interpolation. The additional<br />
goal of plotting a temperature distribution can be implemented easily by counting<br />
75 See appendix for details<br />
76 See section 3.2.1 for details<br />
87
the number of occurrences of the single temperature values within the interpolated<br />
dataset. To allow advanced data analysis of all these calculated results, they are<br />
exported to Microsoft Excel files. Moreover, graphs are plotted and exported as TIF<br />
graphic files. 77<br />
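As an illustration of this calculation scheme, the following Python sketch (the record layout and function name are assumptions, not the thesis's Matlab code) derives min/max from the original samples and mean/standard deviation from the interpolated ones, per day and per daytime/nighttime class:

```python
import statistics

# Hypothetical record layout for illustration: (day, hour, temperature).
def daily_stats(original, interpolated, day_start=6, day_end=22):
    """Per-day, per-class (daytime/nighttime) measures as suggested in
    section 5.10.1: min/max from the original samples (no smoothing),
    mean/standard deviation from the interpolated ones (correct
    weighting in time)."""
    buckets = {}
    for records, kind in ((original, "orig"), (interpolated, "interp")):
        for day, hour, temp in records:
            part = "day" if day_start <= hour < day_end else "night"
            b = buckets.setdefault((day, part), {"orig": [], "interp": []})
            b[kind].append(temp)
    result = {}
    for key, b in buckets.items():
        result[key] = {
            "min": min(b["orig"]) if b["orig"] else None,
            "max": max(b["orig"]) if b["orig"] else None,
            "mean": statistics.fmean(b["interp"]) if b["interp"] else None,
            "stdev": statistics.pstdev(b["interp"]) if len(b["interp"]) > 1 else None,
        }
    return result

samples = [(1, 7, 5.0), (1, 8, 6.0), (1, 2, 4.4), (1, 23, 4.0)]
stats = daily_stats(samples, samples)  # same data for both, for brevity
print(stats[(1, "day")]["mean"])  # 5.5
```

The temperature distribution mentioned above would then be a simple histogram over the interpolated values.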
Besides these statistical calculations, section 5.10.2 introduced the promising approach of<br />
using linear regression. The functionality to calculate common kinds of regression<br />
functions is built into Matlab. The required coefficient of determination can also be<br />
obtained. 78 Hence, a self-made implementation of this kind of statistical analysis is<br />
not necessary.<br />
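Matlab provides this functionality out of the box; for readers without Matlab, an equivalent ordinary-least-squares fit with the coefficient of determination can be sketched in a few lines of Python (illustrative only):

```python
def linreg(xs, ys):
    """Ordinary least squares fit y = a*x + b plus the coefficient of
    determination R^2, as used for the long-run trend analysis."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx                 # gradient
    b = my - a * mx               # intercept
    ss_res = sum((y - (a * x + b)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - my) ** 2 for y in ys)
    r2 = 1.0 - ss_res / ss_tot    # coefficient of determination
    return a, b, r2

# Perfectly linear data gives R^2 = 1
a, b, r2 = linreg([0, 1, 2, 3], [1.0, 1.5, 2.0, 2.5])
print(a, b, r2)  # 0.5 1.0 1.0
```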
By contrast, the suggested data mining methods from section 5.10.3 have to be<br />
implemented from scratch. The first step of the aimed classification is the identification of<br />
alarms. This is done by scanning the interpolated dataset for a temperature<br />
exceedance. As soon as such an exceedance is found, the next uncritical value is looked<br />
up. The time interval between these two values is classified as an alarm. To determine<br />
which alarms were caused by a door opening, only those intervals are<br />
attributed to a door opening that have one within a predefined offset time. After<br />
the identification of the single alarm intervals, the maximum alarm temperatures and the<br />
alarm durations can be collected to calculate the corresponding probability values.<br />
The calculation of the door openings’ durations is more complex because the exact<br />
duration can only be obtained from the non-interpolated datasets. Moreover, only<br />
door openings that lead to an alarm should be recognized. Hence, only the durations<br />
of door openings within the offset time of an alarm should be calculated and collected.<br />
To achieve this, the door openings found within the offset time of alarms are looked up<br />
in the original data to determine their exact duration.<br />
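A minimal Python sketch of this scanning and attribution logic (the threshold, offset and sample data are invented for illustration; the thesis's actual implementation is in Matlab) could look as follows:

```python
def find_alarms(grid_temps, limit):
    """Scan the interpolated series for temperature exceedances.
    Each maximal run of values above the limit is one alarm interval,
    returned as (start_index, end_index_exclusive)."""
    alarms, start = [], None
    for i, t in enumerate(grid_temps):
        if t > limit and start is None:
            start = i
        elif t <= limit and start is not None:
            alarms.append((start, i))
            start = None
    if start is not None:                     # alarm still open at the end
        alarms.append((start, len(grid_temps)))
    return alarms

def caused_by_door(alarms, door_minutes, offset=1):
    """Attribute an alarm to user influence if a door opening occurred
    at most `offset` minutes before the exceedance started."""
    return [(s, e) for s, e in alarms
            if any(s - offset <= d <= s for d in door_minutes)]

temps = [5, 5, 7, 8, 6, 5, 9, 5]          # one value per minute, limit 6
alarms = find_alarms(temps, 6)
user_made = caused_by_door(alarms, door_minutes=[1], offset=1)
print(alarms)     # [(2, 4), (6, 7)]
print(user_made)  # [(2, 4)]
```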
The collected information on maximum temperatures, alarm durations and durations of<br />
door openings is afterwards used to determine the limits for the single classification<br />
classes, as exemplified in Table 6-5 on page 100. 79 As already mentioned, the<br />
technical realization and the technical problems that occurred during implementation can<br />
be found in the appendix. This chapter will continue with the application of these ideas<br />
to a selected sample dataset from the UMC St. Radboud.<br />
77 See appendix for details<br />
78 See section 5.4.2 for details<br />
79 See section 6.2.2 for details<br />
88
6.2 Case Study<br />
The UMC St. Radboud provided 36 datasets, but none of them contains a real<br />
technical malfunction. Furthermore, only ten of these datasets contain data from a<br />
connected door opening sensor. To be able to apply the suggested classification<br />
method, door sensor data is needed to determine whether a temperature exceedance<br />
was caused by a door opening or not. Hence, the sample dataset has to be chosen<br />
from these ten datasets.<br />
Figure 6-2 pictures the selected dataset. It was chosen because it contains<br />
several interesting factors. First of all, the set maximum temperature level was<br />
changed in March 2006 to reduce the quantity of false alarms (indicated by the red<br />
dashed line) [Nijmegen06]. Moreover, this temperature pattern contains eye-catching<br />
behavior. Beside some very high peaks, especially the global minimum, which<br />
occurred on September 22nd, is eye-catching because this behavior is unique within the<br />
whole time span. In addition, a long-run change of cooling behavior of about half a<br />
degree in the mean took place.<br />
Figure 6-2: Temperature Overview of the Selected Sample Dataset<br />
Aside from these interesting factors, all door openings took place between 6 o’clock<br />
in the morning and 10 o’clock in the evening. This will be declared as daytime within<br />
89
this example. Hence, the nighttime data of this chosen dataset is free from external user<br />
influences.<br />
6.2.1 Detection of Changes in Behavior by Using Descriptive Statistics<br />
Section 5.10.1 introduced a promising way of recognizing changes in behavior by<br />
comparing basic descriptive statistical measures. This section will now present the<br />
calculated results for the selected dataset. As described in section 5.10.1, the<br />
suggested notification service compares succeeding daily daytime and nighttime<br />
values and reports irregular ones. An irregularity was defined as a change of more than<br />
delta times the mean change. To obtain a better feeling for this delta, the calculated<br />
results are first presented in graphical form. Afterwards, two different deltas are chosen<br />
and their corresponding notifications are calculated. 80<br />
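The notification rule itself is simple; an illustrative Python sketch (assuming one value per day, e.g. the nightly maxima, with invented numbers) flags day-to-day changes larger than delta times the mean change:

```python
def notifications(values, delta):
    """Flag day-to-day changes larger than delta times the mean
    absolute day-to-day change (the thesis's irregularity criterion).
    Returns (day_index, change) pairs."""
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    mean_change = sum(diffs) / len(diffs)
    return [(i + 1, d) for i, d in enumerate(diffs) if d > delta * mean_change]

# Nightly maxima with one exceptional jump around day 4
maxima = [5.0, 5.1, 5.0, 5.1, 6.4, 5.1, 5.0]
# mean change = (0.1 + 0.1 + 0.1 + 1.3 + 1.3 + 0.1) / 6 = 0.5
flagged = notifications(maxima, delta=2)
print([day for day, _ in flagged])  # [4, 5]
```

A larger delta raises the threshold and thus suppresses notifications for small deviations, which mirrors the two settings tested below.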
The following Figures 6-3 to 6-10 contain the calculated results of the daily daytime and<br />
nighttime values of the promising basic measures. 81 To allow an easy<br />
comparison of daytime and nighttime values, they are plotted with the same scale.<br />
The results of the whole-day analysis, which was also suggested, are not pictured because they<br />
do not differ significantly from the daytime results.<br />
The daytime maximum values are very irregular. In fact, they are almost comparable<br />
to the general temperature overview. Eye-catching, however, is a very low maximum value<br />
at the end of November. The nighttime values offer a much clearer overview of the<br />
system’s general behavior due to the absence of external user influences. But again,<br />
the graph contains the eye-catching low temperature. As this exceptional value is<br />
also recognizable at nighttime, it should be reported to a person in charge. A closer<br />
look at Figure 6-2 reveals a real change of cooling behavior within that time span<br />
of nearly two days, so that this notification is justified.<br />
Beside this exceptional value, the nighttime maximum values show another<br />
significant change at the beginning of January. Eye-catching, furthermore, is the<br />
behavior at the end of January. Within very few days, the daily nighttime maxima<br />
increased by about half a degree and nearly remained on that level for the rest of the<br />
monitoring time. This significant step also indicates a change in cooling behavior and<br />
should be notified.<br />
80 See Table 6-2 for details<br />
81 See section 5.10.1 for details<br />
90
Figure 6-3: Maximum Values at Daytime<br />
Figure 6-4: Maximum Values at Nighttime<br />
91
Daytime and nighttime minimum values are very similar to each other. The reason for<br />
this similarity lies in the kind of external user influences: most influences are<br />
caused by door openings or the insertion of warm samples, so that the minimum<br />
temperature remains uninfluenced. 82<br />
The calculated daily minimum values also contain several remarkable changes.<br />
First of all, daytime and nighttime values contain an exceptional change in minimum<br />
temperature of more than 1 °C at the end of September. Regular changes from one<br />
day to another are normally at most 0.2 °C, so this change should be notified. In<br />
fact, this change in minimum temperature indicates the already introduced global<br />
minimum temperature that occurred on September 22nd.<br />
In addition, some other remarkable changes exist. The first one is a rise in<br />
minimum temperature about one week before the global minimum occurs. Moreover,<br />
the already mentioned change at the end of November is eye-catching again. As<br />
maximum as well as minimum temperatures are remarkably different, a change of the<br />
general temperature level must have taken place. The last eye-catching factor is that<br />
the calculated minimum values contain a general trend.<br />
The mean daytime and nighttime values appear very similar at first sight. But a<br />
closer look reveals some significantly higher peaks in the daytime data, which are not<br />
recognizable at nighttime. The reason for these peaks cannot be determined for sure,<br />
but presumably they are again caused by external user influences like door<br />
openings.<br />
But even the uninfluenced nighttime values show higher variations than the<br />
already reviewed minimum and maximum temperatures. This complicates the<br />
identification of significant changes. But again, the changes from the end of<br />
September and November are recognizable. Moreover, the mean values also contain<br />
the mentioned trend.<br />
The last promising measure is the standard deviation. Again, the daytime values<br />
contain several high peaks that can be traced back to door openings. By<br />
contrast, the graph of the standard deviation at nighttime indicates the changes from<br />
the end of September and November more clearly than any other introduced<br />
measure.<br />
82 See section 2.4.2 for details<br />
92
Figure 6-5: Minimum Values at Daytime<br />
Figure 6-6: Minimum Values at Nighttime<br />
93
Figure 6-7: Mean Values at Daytime<br />
Figure 6-8: Mean Values at Nighttime<br />
94
Figure 6-9: Standard Deviation at Daytime<br />
Figure 6-10: Standard Deviation at Nighttime<br />
95
The visual analysis of these graphs revealed that the selected dataset contains<br />
several changes in behavior. Most significant are the global minimum on September<br />
22nd and the change in temperature level at the end of November. Also eye-catching,<br />
but not as significant, are the mentioned changes in the middle of<br />
September and the general rise in temperature in the year 2006. Looking at the<br />
graphs furthermore indicated that the notification of changes based on daytime data<br />
is not promising, due to the high number of external influences.<br />
After this graphical overview, the numerical analysis will be evaluated by testing two<br />
different deltas. Meaningful results can be obtained by choosing five or ten as delta.<br />
The lower delta leads to earlier notifications; the higher delta only notifies larger<br />
deviations. Table 6-2 pictures the mean deviations in the nighttime data as well as the<br />
minimum deviations that lead to a notification using the corresponding delta.<br />
Table 6-2: The Chosen Deltas<br />
Mean Deviation 5 x Mean Deviation 10 x Mean Deviation<br />
Maximum 0.027 0.135 0.27<br />
Minimum 0.043 0.215 0.43<br />
Mean 0.035 0.175 0.35<br />
Standard Deviation 0.011 0.055 0.11<br />
Table 6-3 pictures the calculated results. A yellow-marked cell indicates a notification<br />
for a delta of five. If a delta of ten were predefined, only the red-marked values<br />
would be notified.<br />
Using a delta of ten would have notified the most eye-catching changes in<br />
September and November. A delta of five would lead to 45 notifications. In fact, most<br />
of them are caused by the standard deviation and are not bound to significant<br />
changes in general behavior. Hence, the standard deviation should only be used with<br />
a high delta or left out. The remaining measures can also be used with a delta of five.<br />
The notifications in July, January and February can actually be traced back to<br />
small changes in cooling behavior, so that such a notification is justified.<br />
Hence, comparing nighttime measures from succeeding time intervals enables the<br />
recognition of changes on the short-run (on a daily basis). The notification level can be<br />
adapted by choosing a higher or a smaller delta. Presumably, different deltas for<br />
different kinds of machines have to be chosen to find the right balance between too<br />
many and too few notifications.<br />
96
Table 6-3: Reported Notifications (Based on Nighttime Data)<br />
Maximum Minimum Mean Standard Deviation<br />
15.06.2005 0 0 0.1 0.1<br />
18.07.2005 0 0.3 0 0<br />
19.07.2005 0 0 0.2 0.1<br />
21.07.2005 0 0 0.1 0.1<br />
23.07.2005 0 0 0.1 0.1<br />
24.07.2005 0.1 0 0.1 0.1<br />
03.08.2005 0.1 0 0 0.1<br />
04.08.2005 0 0.1 0 0.1<br />
10.08.2005 0 0 0.1 0.1<br />
19.08.2005 0.1 0 0 0.1<br />
20.08.2005 0.1 0 0 0.1<br />
25.08.2005 0 0 0 0.1<br />
26.08.2005 0 0 0 0.1<br />
30.08.2005 0 0.3 0.1 0.1<br />
20.09.2005 0 1.3 0.2 0.3<br />
21.09.2005 0 0 0.4 0<br />
22.09.2005 0.1 1 0.4 0.3<br />
26.09.2005 0 0.1 0 0.1<br />
27.09.2005 0 0.1 0 0.1<br />
24.10.2005 0 0.2 0.2 0<br />
28.11.2005 0 0.3 0.1 0.1<br />
29.11.2005 0.3 0 0.2 0.1<br />
30.11.2005 0.4 0 0.1 0.1<br />
01.12.2005 0.1 0.4 0.3 0.1<br />
05.01.2006 0.3 0 0.2 0<br />
06.01.2006 0.2 0 0.1 0.1<br />
10.01.2006 0 0 0.2 0<br />
11.01.2006 0 0 0 0.1<br />
12.01.2006 0 0 0.1 0.1<br />
24.01.2006 0.2 0.2 0.2 0<br />
26.01.2006 0 0 0 0.1<br />
27.01.2006 0 0 0 0.1<br />
01.02.2006 0 0 0 0.1<br />
07.02.2006 0.1 0.2 0.1 0.1<br />
23.02.2006 0.2 0 0.1 0.1<br />
24.02.2006 0.1 0.1 0.2 0<br />
03.03.2006 0 0 0.1 0.1<br />
08.03.2006 0 0.2 0.1 0.1<br />
11.03.2006 0.1 0 0.1 0.1<br />
12.03.2006 0.1 0.1 0.1 0.1<br />
15.03.2006 0.2 0 0.1 0<br />
01.05.2006 0.2 0 0.1 0<br />
11.05.2006 0 0.2 0.1 0.1<br />
17.05.2006 0.1 0.2 0.1 0.1<br />
23.05.2006 0.2 0 0.1 0<br />
97
Aside from the determination of changes on the short-run, section 5.10.1 suggested<br />
offering visualization possibilities for occurred door openings. Furthermore, a graphical<br />
temperature distribution was suggested to assess the accuracy of a cooling device.<br />
Figure 6-11 pictures these additional ideas. The overview of door openings allows an<br />
easy comparison of usage with other devices. Moreover, the pictured distribution allows<br />
a very fast overview of the device’s accuracy on the long-run: the sharper the peak,<br />
the higher the accuracy. Remarkable in this example is the second peak, which<br />
indicates the significant change in behavior.<br />
Figure 6-11: Daily Door Openings and Temperature Distribution of the Selected Dataset<br />
This section showed that even the simple application of basic statistical measures<br />
can reveal changes in general behavior. Until these results were calculated, the<br />
corresponding cooling device was classified as running well; no one had recognized<br />
these changes.<br />
6.2.2 Detection of Changes in Behavior by the Use of Regression<br />
Section 5.10.2 introduced the promising idea of detecting changes in general behavior<br />
on the long-run by the use of regression. Applying linear regression to the<br />
selected dataset leads to the regression function in Formula 6-1. Remarkable is<br />
the high coefficient of determination, which indicates a very good approximation. 83<br />
Figure 6-12 offers a graphical representation.<br />
ŷ = 0.0019307x − 1409.6,   R² = 0.97492<br />
Formula 6-1: Regression Function and Coefficient of Determination<br />
83 See section 5.4.2 for details<br />
98
Only the gradient of the determined function is important for the determination of the<br />
trend. The constant component of the regression function stems from Matlab’s<br />
internal representation of the date and can be ignored. The gradient has to be<br />
multiplied by the number of days. As the selected monitoring data covers a time<br />
span of 366 days, a trend of 0.0019307 ⋅ 366 ≈ 0.7 °C is recognized on the long-run. A<br />
closer look at Figure 6-8 confirms this trend.<br />
Figure 6-12: Regression Function for the Selected Dataset<br />
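The arithmetic of this trend estimate can be checked directly (the gradient value is taken from Formula 6-1; it is in degrees Celsius per day, multiplied by the number of monitored days):

```python
# Matlab serial date numbers advance by one per day, so the regression
# gradient is in degrees Celsius per day and the long-run trend is
# simply gradient times the number of monitored days.
gradient = 0.0019307   # from Formula 6-1
days = 366             # monitored time span
trend = gradient * days
print(round(trend, 1))  # 0.7
```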
6.2.3 Classification of Alarms by the Use of Historical Data<br />
Section 5.10.3 introduced the promising idea of classifying alarms in case of a<br />
temperature exceedance by the use of historical data. The first step introduced was the<br />
determination of whether an alarm can be traced back to a door opening. For this, an<br />
offset time has to be defined that specifies how far back the last door opening may date.<br />
Table 6-4 pictures different chosen offset times and the corresponding classification<br />
of alarms. 139 alarms occurred within one minute after a door was opened. A defined<br />
offset time of three minutes would leave only two alarms that would immediately be<br />
classified as red ones, and an offset time of ten minutes would lead to the result that<br />
all alarms were user-made.<br />
99
Table 6-4: Classification of Alarms<br />
Selected Offset time (Minutes) Number of Alarms Caused by Door Openings<br />
1 139/158<br />
2 155/158<br />
3 156/158<br />
4 157/158<br />
10 158/158<br />
To classify these user-made alarms, the suggested classes from section 5.10.3 are<br />
used, and the suggested criteria are used to determine the current condition.<br />
Table 6-5 contains the corresponding results of the data analysis of the selected sample<br />
dataset. Based on the historical behavior, a green alarm will currently be raised in case<br />
of a door opening that takes at least 19 seconds, because 50 percent took less time.<br />
Moreover, a temperature of 6.9 °C or higher and a temperature exceedance of 7<br />
minutes or more would each have the same effect. But as this data is calculated<br />
dynamically, these results only mirror a snapshot.<br />
Table 6-5: Results of Classification According to Single Criterions<br />
Criterion: Duration of Door Openings / Maximum Temperature During an Alarm / Maximum Duration of an Alarm<br />
Green Alarm: ≥ 19 seconds / ≥ 6.9 °C / ≥ 7 minutes<br />
Yellow Alarm: ≥ 38 seconds / ≥ 7.8 °C / ≥ 12 minutes<br />
Red Alarm: ≥ 84 seconds / ≥ 9.8 °C / ≥ 27 minutes<br />
Occurred Red Alarms: 39 / 8 / 42 / 9<br />
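The class limits themselves are derived from historical data. Only the 50-percent rule for the green limit is stated above; the following Python sketch therefore treats the yellow and red percentile levels (75 and 95) and the sample durations as pure assumptions for illustration:

```python
import math

def percentile(sorted_vals, p):
    """Nearest-rank percentile on an already sorted list."""
    k = max(0, math.ceil(p / 100 * len(sorted_vals)) - 1)
    return sorted_vals[k]

def class_limits(durations, percents=(50, 75, 95)):
    """Green limit at the median (50 percent took less time, as stated
    in the text); the 75/95 levels for yellow/red are assumptions made
    only for this illustration."""
    vals = sorted(durations)
    return {c: percentile(vals, p)
            for c, p in zip(("green", "yellow", "red"), percents)}

door_seconds = [5, 8, 12, 19, 21, 30, 38, 60, 84, 120]  # invented sample
print(class_limits(door_seconds))  # {'green': 21, 'yellow': 60, 'red': 120}
```

Because the limits are recomputed from the growing history, they adapt automatically to the behavior of each individual cooling device, which is exactly why the table above only mirrors a snapshot.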
Besides the current limits for the single alarm classes, the last row of Table 6-5 is of<br />
special interest. It contains the number of red alarms that would have been raised<br />
during the whole monitoring time. This quantity of 42 red alarms differs significantly<br />
from 158, so that more than 60% of all occurred alarms could be classified as<br />
less critical. But if this classification method is applied to very critical devices,<br />
additional conditions are needed. If, for instance, the temperature level may not be<br />
exceeded for more than 15 minutes, the red alarm should go off earlier.<br />
100
Remarkable is a comparison with the actually applied method of setting a higher<br />
maximum temperature limit, which was used to reduce the quantity of false alarms. 84<br />
Data analysis revealed that this set temperature limit was exceeded 152 times. In<br />
case of an unchanged temperature limit of 6 °C, this number would have increased to<br />
158. Hence, this method saved 6 alarms but increased the notification delay in case<br />
of a real malfunction.<br />
6.3 Review<br />
Section 5.10.4 already pointed out the estimated improvements that might be<br />
achieved by using the suggested statistical and data mining methods. This section<br />
will review whether these estimated improvements actually occurred.<br />
First of all, descriptive statistics led to the estimation that the currently limited ability<br />
to detect changes on the short-run might be improved. The application to the selected<br />
sample dataset showed that major changes in the general cooling behavior were actually<br />
detected. Moreover, an adjustment to different security levels can be achieved by<br />
selecting bigger or smaller deltas. Only the application to daytime data does not<br />
provide reliable notifications, so that changes can only be recognized from morning<br />
to morning. But as this method recognized previously unknown irregularities, it<br />
definitely improves the recognition of changes on the short-run.<br />
In addition, the application of regression provided very good results. The<br />
determined function had a very good fit (R² = 0.97492) and contained a gradient that<br />
described the actually occurring temperature increase well. Hence, as long as the<br />
monitoring data is not faced with too many influences that lead to a fit of less than 0.9,<br />
this method is able to reliably detect changes in behavior on the long-run.<br />
Section 5.10.4 pointed out that the combination of basic statistics and regression<br />
could lead to a limited ability to predict upcoming failures. In fact, both methods<br />
detected changes in behavior, but the cooling device kept on functioning. Hence, the<br />
gained results could be an indication of an upcoming malfunction, but do not have to<br />
be. Moreover, the optimization of the cooling devices’ usage by analyzing door<br />
openings cannot be verified within this diploma thesis but has to be tested in<br />
practice.<br />
The application of data mining confirmed the estimated improvements. The gain of<br />
additional system states improved the previously limited possibility of classifying the<br />
84 See introduction of chapter 6 for details<br />
101
current state of a monitored device. As this classification also takes door<br />
openings into account, a limited possibility of recognizing external influences is achieved.<br />
Hence, a combination of statistical analysis and data mining is able to significantly<br />
improve the current monitoring situation. The achieved improvements are<br />
summarized again in Table 6-6. Blue arrows represent improvements achieved by<br />
the use of statistical analysis and magenta arrows represent improvements achieved<br />
by the use of data mining. Blue dashed arrows represent estimated improvements by<br />
the use of statistical analysis that cannot be confirmed for sure, due to the named<br />
reasons.<br />
Table 6-6: Achieved Improvements<br />
• Approach is able to classify the current state of a monitored device<br />
• Approach is able to recognize significant changes of behavior on the short-run<br />
• Approach is able to recognize significant changes of behavior on the long-run<br />
• Approach is able to predict upcoming failures<br />
• Approach is able to identify failures as soon as they are recognizable<br />
• Approach is able to avoid an error of second kind in any case<br />
• Approach is able to recognize external influences<br />
• Approach is able to optimize the usage of cooling devices<br />
• Approach offers a very quick overview of the cooling device’s accuracy<br />
6.4 Recommendations<br />
This diploma thesis pointed out the general problems of currently applied sensor<br />
based temperature monitoring. Most problematic was the very low probability of a<br />
real technical malfunction compared to irregular temperatures caused by<br />
door openings or other external influences. Hence, the idea of just evaluating the<br />
current temperature of a cooling device leads to a large number of false alarms (e.g.<br />
158 for the selected sample dataset within one year).<br />
102
Interviews with several employees of the UMC St. Radboud revealed that<br />
currently no decision support exists that offers recommendations on<br />
what should be done in case of such an alarm. Not even the time and<br />
duration of the last door opening is displayed to offer at least a hint of a possible user<br />
influence. The only thing an employee can do in case of an alarm is to inspect the<br />
corresponding cooling device manually by having a short look at it.<br />
In fact, the very high quantity of false alarms led to a loss in credibility of XiltriX, so<br />
that employees tend to wait for a certain time after an alarm goes off. Only in case of an<br />
alarm enduring for a longer time period, or the occurrence of an uncommonly high<br />
number of alarms during a short time interval, is a manual inspection really made in<br />
most cases. [Nijmegen06]<br />
As long as the stored contents are not damageable within very few minutes, this<br />
practice is workable. But the estimation of whether a current alarm is developing<br />
like most others relies on the experience and instinct of the operational staff. The<br />
suggested data mining method of classifying the development of an alarm into different<br />
alarming levels offers a higher reliability, because the decision whether an alarm has<br />
to be classified as really critical is based on all available information, like door<br />
openings or past behavior, and not on unreliable user-made estimations.<br />
Hence, from my point of view, this classification method should be added to XiltriX to<br />
offer additional decision support and to reduce the number of demanded inspections.<br />
Highly critical devices may either be excluded from this classification or assigned<br />
other classification parameters and additional conditions.<br />
Contell/IKS confirms the possible improvements but fears that this classification<br />
could lead to even greater user misbehavior, because classifications lower<br />
than the highest level might be ignored. Consequently, the current user behavior of<br />
waiting for a certain time interval might be applied to the highest classification, so that the<br />
user reaction is delayed to an unacceptable level.<br />
The other major problem of sensor based temperature monitoring was the limited<br />
ability to recognize changes on the short- and long-run. Up to now, only changes<br />
bound to periodically occurring alarms are recognized. The suggested methods<br />
of using statistical analysis and regression to determine changes within the normal<br />
temperature range achieved a major improvement of this situation. Only the<br />
determination of an appropriate delta for different kinds of cooling devices still needs<br />
to be done in practice.<br />
103
This recognition of changes in behavior within the operating temperature interval is<br />
currently impossible with all introduced monitoring products, so that this feature<br />
would add a unique selling proposition. Because of the accurate results of these<br />
methods and the prospect of gaining a unique selling proposition, Contell/IKS is<br />
interested in these methods.<br />
The additionally suggested idea of optimizing the usage of cooling devices by comparing<br />
the quantities of door openings with each other could not be tested for possible<br />
improvements within this diploma thesis, due to missing testing possibilities. But this<br />
method was also presented to Contell/IKS. The person in charge confirmed the<br />
possibility of improvement. But the main focus of Contell/IKS will first of all lie on the<br />
implementation of the introduced statistical methods to enable XiltriX to detect<br />
changes in general behavior within normal operation.<br />
Based on these facts, my recommendation is to implement the two statistical methods and<br />
the data mining method, because all three offer great results. Moreover, the optimization<br />
of the cooling devices’ usage should be tested for applicability. But due to the<br />
concerns about user misbehavior, Contell/IKS will not focus on the presented data<br />
mining method.<br />
104
7 Summary<br />
Cooling devices within medical laboratories often contain irrecoverable samples that are part of research work. As the loss of such a sample can cause damage of half a million euros, a warming up of the cooling device's contents has to be avoided in any case. Therefore, sensor-based temperature monitoring systems have been developed to notify a person in charge as soon as (or even before) a fridge starts to malfunction.
Whether a cooling device is malfunctioning is currently determined solely on the basis of critical temperature values. This approach causes many false alarms due to door openings and other external influences. Moreover, the measurement data is stored mainly for documentation purposes.
The research task was to determine what additional knowledge could be gained from the stored datasets by using statistics and data mining. The aim was to gain additional knowledge about a cooling device's condition from the recorded datasets, in order to offer additional decision support in case of an exceptional temperature level. Moreover, a method to reliably predict upcoming malfunctions was sought.
The research started with an analysis of the regular and irregular behavior of cooling devices. A major result was that every cooling device shows a deviating temperature sequence, due to its technical functioning. Aside from that, the cooling behavior is disturbed by many environmental influences, mostly caused by user interaction. The last discovered problem is a lack of information, which in most cases prevents finding the reason for a heating up.
The third chapter reviewed XiltriX and the other major monitoring systems available. Some of these systems are kept very simple, while others offer many additional features. A detailed analysis of all these products revealed, however, that all of them are based on the insufficient idea of merely setting critical temperature limits.
The fourth chapter reviewed the current state of research. As no current research activity in this specific setting of sensor-based temperature monitoring could be found, the main focus was placed on the similar settings of machinery condition monitoring and measurement data analysis. Remarkable is a generalized data analysis approach that promised to predict future values of all kinds of measurement data without any knowledge of the underlying setting. Due to the high number of external influences, however, the application of this approach failed.
Hence, chapter 5 reviewed the other promising approaches from chapter 4 for applicability. In particular, the promising application of time series analysis and artificial neural networks failed, mainly because of the very low probability of a real malfunction and the lack of training data containing malfunctions.
By contrast, three methods were identified that improve the current monitoring situation. The first one is based on the statistical measures minimum, maximum, mean and standard deviation. The basic idea is to detect changes in the general cooling behavior by comparing these measures across succeeding time intervals. As soon as a change is significantly higher than average, the user is notified of this short-run change. To avoid too many false notifications, only the uninfluenced nighttime data is used, which can be identified by means of a door opening sensor.
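To make this short-run comparison concrete, the idea can be sketched in a few lines (a Python sketch rather than the thesis's Matlab implementation; the threshold factor of 3 and the example temperatures are illustrative assumptions, not values from the thesis):

```python
# Sketch: flag a day whose nighttime mean changes by significantly more
# than the average day-to-day change seen so far (factor is an assumption).
def detect_shift(nightly_means, factor=3.0):
    deltas = [abs(b - a) for a, b in zip(nightly_means, nightly_means[1:])]
    if len(deltas) < 2:
        return None  # not enough history to judge
    avg_delta = sum(deltas[:-1]) / len(deltas[:-1])  # history without today
    return deltas[-1] > factor * avg_delta

# Stable nights around -20 °C, then a sudden warming of the nighttime mean:
print(detect_shift([-20.1, -20.0, -20.2, -20.1, -19.9, -18.6]))  # → True
```

In a monitoring system, `nightly_means` would be the per-night mean temperatures computed from the door-sensor-filtered data, and a positive result would trigger the short-run notification.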
Moreover, linear regression can be used to determine a trend in the long run. Although the temperature data is not linear, the achieved fit is sufficient to obtain reliable results. The last identified method is based on data mining: the general idea is to compare the current behavior to similar situations in the past and their subsequent development. This enables a classification into different alarm levels.
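The long-run check can be sketched the same way: fit a least-squares line through the daily mean temperatures and inspect its slope (again a Python sketch instead of Matlab; the example series and the 0.05 °C-per-day threshold are purely illustrative assumptions):

```python
# Sketch: least-squares slope of daily mean temperatures over day index;
# a clearly positive slope indicates a long-run warming of the device.
def slope(values):
    n = len(values)
    mean_x = (n - 1) / 2            # mean of day indices 0..n-1
    mean_y = sum(values) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    den = sum((x - mean_x) ** 2 for x in range(n))
    return num / den

days = [-20.0, -19.9, -19.7, -19.6, -19.4, -19.3]  # slowly warming freezer
print(slope(days) > 0.05)  # → True (about 0.15 °C per day)
```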
Chapter 6 introduced the implementation of the identified methods in Matlab and applied them to a selected sample dataset from the UMC St. Radboud (university hospital of Nijmegen, the Netherlands). As a result, the combination of statistical analysis and data mining is able to significantly improve the current monitoring situation. Changes in behavior in the short run can be discovered by comparing daily statistical measures. Moreover, regression can be used to determine changes in the cooling behavior in the long run.
Using the suggested classification leads to the desired additional decision support in case of a temperature exceedance. Only the goal of reliably predicting upcoming failures cannot be achieved, because of unrecognizable external influences and the very low probability of a real technical malfunction. But since the recognition of changes in cooling behavior may automatically hint at an upcoming malfunction, this goal is at least partly reached.
Bibliography<br />
Books and Articles:<br />
[Beichelt97] Frank Beichelt, Stochastische Prozesse für Ingenieure, B.G.<br />
Teubner, Stuttgart, 1. Edition, 1997<br />
[Benker01] Hans Benker, Statistik mit Mathcad und Matlab, Springer Verlag,<br />
Berlin, 1. Edition, 2001<br />
[Berthold99] Michael Berthold & David J. Hand, Intelligent Data Analysis,<br />
Springer Verlag, Berlin, 1. Edition, 1999<br />
[Blasig95] Reinhard Blasig, Neuronale Netze und die Induktion<br />
symbolischer Klassifikationsregeln, Dissertation, Universität<br />
Kaiserslautern, 1995<br />
[Bohnekamp97] H. Bonekamp, Monitor to Guard Fridge Temperature, In: Elektor<br />
Electronics, Canterbury: Elektro Publ. Ltd, ISSN 0308-308X, 23,<br />
p. 58-61, 1997<br />
[Bourier03] Günther Bourier, Beschreibende Statistik, Gabler Verlag,<br />
Wiesbaden, 5. Edition, 2003<br />
[Chatfield04] Chris Chatfield, The Analysis of Time Series, Chapman &<br />
Hall/CRC, Boca Raton (Florida), Sixth Edition, 2004<br />
[Daßler95] Frank Daßler, Tendenztriggerung – Meßdatenanalyse im on-line-<br />
Betrieb mit dem Ziel der frühzeitigen Erkennung und Vorhersage<br />
von Daten, Trends und Störungen, Dissertation, TU Chemnitz-<br />
Zwickau, 1995<br />
[Eckey02] Hans-Friedrich Eckey & Reinhold Kosfeld & Christian Dreger,<br />
Statistik, Gabler Verlag, Wiesbaden, 3. Edition, 2002<br />
[Gentle02] James E. Gentle, Elements of Computational Statistics, Springer<br />
Verlag, New York, 1. Edition, 2002<br />
[Hagen97] Claudia Hagen, Neuronale Netze zur statistischen Datenanalyse,<br />
Dissertation, Technische Hochschule Darmstadt, 1997<br />
[Hawibowo97] Singgih Hawibowo, Sicherheitstechnische Abschätzung des<br />
Betriebszustandes von Pumpen zur Schadensfrüherkennung,<br />
Dissertation, Technische Universität Berlin, 1997<br />
[Heinzelmann99] Dipl.-Ing. Andreas Heinzelmann, Produktintegrierte Diagnose<br />
komplexer mobiler Systeme, VDI Verlag, Düsseldorf, VDI Reihe<br />
12, Nr. 391, 1999<br />
[Heuer97] Jürgen Heuer, Neuronale Netze in der Industrie, Gabler Verlag,<br />
Wiesbaden, 1. Edition, 1997<br />
[Holland01] Heinrich Holland & Kurt Scharnbacher, Grundlagen der Statistik,<br />
Gabler Verlag, Wiesbaden, 5. Edition, 2001<br />
[Jondral02] Friedrich Jondral & Anne Wiesler, Wahrscheinlichkeitsrechnung<br />
und stochastische Prozesse, B.G. Teubner, Stuttgart, 2. Edition,<br />
2002<br />
[Kolerus95] Josef Kolerus, Zustandsüberwachung von Maschinen, Expert<br />
Verlag, Renningen-Malmsheim, 2. Edition, 1995<br />
[Krallmann05] Jens Krallmann, Einsatz eines Multisensors für ein Condition<br />
Monitoring von mobilen Arbeitsmaschinen, Dissertation, TU<br />
Braunschweig, 2005<br />
[Krems94] Josef F. Krems, Wissensbasierte Urteilsbildung, Hans Huber<br />
Verlag, Göttingen, 1. Edition, 1994<br />
[Lusti02] Markus Lusti, Data Warehousing und Data Mining, Springer<br />
Verlag, Berlin, 2. Edition, 2002<br />
[Martin98] Wolfgang Martin, Data Warehousing – Data Mining – OLAP,<br />
Thomson Publishing International, Bonn, 1. Edition, 1998<br />
[Masing88] Dr. Walter Masing, Handbuch der Qualitätssicherung, Carl<br />
Hanser Verlag, München, 2. Edition, 1988<br />
[Multhaupt00] Marko Multhaupt, Data Mining und Text Mining im strategischen<br />
Controlling, Shaker Verlag, Aachen, 1. Edition, 2000<br />
[Nauth05] Peter Nauth, Embedded Intelligent Systems, Oldenbourg Verlag,<br />
München, 1. Edition, 2005<br />
[Pitter01] Frank Pitter, Verfügbarkeitssteigerung von Werkzeugmaschinen<br />
durch Einsatz mechatronischer Sensorlösungen, Meisenbach<br />
Verlag, Bamberg, 1. Edition, 2001<br />
[Sick00] Dipl.Inform. Bernhard Sick, Signalinterpretation mit Neuronalen<br />
Netzen unter Nutzung von modellbasierten Nebenwissen am<br />
Beispiel der Verschleißüberwachung von Werkzeugen in CNC-<br />
Drehmaschinen, VDI Verlag, Düsseldorf, VDI Reihe 10, Nr. 629,<br />
2000<br />
[Scharnbacher04] Heinrich Holland & Kurt Scharnbacher, Grundlagen statistischer<br />
Wahrscheinlichkeiten, Gabler Verlag, Wiesbaden, 1. Edition,<br />
2004<br />
[Turunen99] Esko Turunen, Mathematics behind Fuzzy Logic, Physica Verlag,<br />
Heidelberg, 1. Edition, 1999<br />
[Waldmann04] Karl-Heinz Waldmann & Ulrike M. Stocker, Stochastische<br />
Modelle, Springer Verlag, Berlin, 1. Edition, 2004<br />
[Wittenberg98] Reinhard Wittenberg, Grundlagen computerunterstützter<br />
Datenanalyse – Band 1, Lucius & Lucius, Stuttgart, 2. Edition,<br />
1998<br />
Interviewees:
[Nijmegen06] Several employees at UMC St. Radboud (University Hospital of Nijmegen, the Netherlands), Date: June 2nd, 2006
[Weerdesteyn06] Han Weerdesteyn, Product Manager of Contell/IKS
WebPages:
[2DI2006] Two Dimensional Instruments, LLC. (http://www.e2di.com/thermaviewer.html), Last visit: November 29th, 2006
[3M2006] 3M Worldwide (http://solutions.3m.com/wps/portal/3M/en_US/Microbiology/FoodSafety/products/time-temperature-indicators/), Last visit: November 28th, 2006
[AES06] AES Chemunex (http://www.aes-labguard.com), Last visit: November 29th, 2006
[DeltaTRAK06] DeltaTRAK (http://www.deltatrak.com/thermo_cdx.shtml), Last visit: November 29th, 2006
[Rees06] Rees Scientific (http://www.reesscientific.com/Centron.htm), Last visit: November 29th, 2006
[Triple06] Triple Red – Laboratory Technology (http://www.triplered.com/Products/alarms.htm), Last visit: November 29th, 2006
[UniMunich06] University of Munich (http://leifi.physik.uni-muenchen.de/web_ph09/umwelt_technik/07kuehlschrank/kuehlschrank.htm), Last visit: November 8th, 2006
Other Sources:
[DEMO06] Exported data and screenshots, Contell/IKS demo system, Date of export: June – November, 2006 (according to requirements)
[UMC06] Exported operating data, UMC St. Radboud (University Hospital of Nijmegen, the Netherlands), Date of export: June 1st, 2001
Appendix 1 – Implementation of Interpolation<br />
As already explained in section 6.1, the collected datasets have to be interpolated to obtain constant time intervals between the single measuring values. This interpolation is done by the following algorithm.
The basic steps of this algorithm are:<br />
1. Import of the monitoring data<br />
2. Conversion of date and time to the right format<br />
3. Interpolation of a measurement value for every single minute<br />
(Number of door openings is stored to the beginning of a minute)<br />
4. Storage of the calculated values to disk<br />
5. Reimport of calculated values from disk for validation purposes<br />
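Step 3, the linear interpolation of a value for every whole minute between two irregularly spaced measurements, can be sketched as follows (a Python sketch with hypothetical sample points; the actual Matlab implementation is printed below):

```python
# Sketch: interpolate a temperature for every whole minute between
# irregularly spaced measurements; timestamps are minutes since start.
def per_minute(samples):
    out = []
    for (t0, v0), (t1, v1) in zip(samples, samples[1:]):
        t = int(t0) + 1  # first whole minute after t0
        while t <= t1:
            frac = (t - t0) / (t1 - t0)
            out.append((t, round(v0 + frac * (v1 - v0), 1)))
            t += 1
    return out

# Measurements at minute 0.0 and minute 2.5 yield values for minutes 1 and 2:
print(per_minute([(0.0, -20.0), (2.5, -19.0)]))  # → [(1, -19.6), (2, -19.2)]
```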
To be able to import the CSV files from XiltriX, they have to be adapted, as already mentioned in section 6.1. The reason is a different usage of delimiters: XiltriX exports the data with a point as thousands separator and a comma as decimal separator, whereas Matlab interprets the point as decimal separator and the comma as column separator. To solve this problem, two simple replacements have to be made with a text editor, in the following order:
1. Replace “.” with “”
2. Replace “,” with “.”
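The same two replacements could just as well be scripted instead of being done by hand in an editor; a minimal Python sketch (the sample line is hypothetical):

```python
# Sketch: strip the thousands separator first, then turn the decimal
# comma into a decimal point -- the order of the two steps matters.
def convert(text):
    return text.replace('.', '').replace(',', '.')

print(convert('01-06-01 00:01:23;1.234,5'))  # → 01-06-01 00:01:23;1234.5
```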
An experienced programmer may notice that the following code is written in an iterative rather than an object-oriented manner. The reason is the limited possibilities Matlab offers. It is indeed possible to encapsulate at least procedures in so-called M-files.85 But these have a significant negative influence on the runtime, a phenomenon that can be traced back to Matlab's internal data exchange behavior. Hence, a simple iterative structure is used.
Another problem of Matlab is the lack of well-scaling data types such as linked lists. Hence, the following algorithm slows down very quickly. The first thousand values, for instance, are calculated in about 45 seconds; that is nearly 10 times faster than the second thousand values, and the next thousand values take even more calculation time.86 As the collected datasets contain about 37,000 values, the calculation would take hours to days. The solution found is to store the intermediate data to disk every 250 values, which leads to a running time of about 26 minutes for a 37,000-value dataset.
85 See (e.g. [Benker01], p. 48-55)
86 Tests were made with a Pentium 3 mobile, 1 GHz, 256 MB RAM
The actual implementation is printed on the following pages.<br />
%Name and location of the source file
unit = 1;
filename = strcat('Channel', int2str(unit), '.csv');
path = 'C:\Dokumente und Einstellungen\Christian\Eigene Dateien\Dokumente\Studium\Diplomarbeit\Monitoring Data Nijmegen (Converted)\';
%Import Dataset
import = importdata(strcat(path, filename));
%If Doorsensor is not available, add a 0 column (for compatibility reasons)
if length(import.data(1,:)) == 5;
import.data(:,6) = 0;
disp('No Doorsensor installed! => Column added');
end
%Create Datevector (as serial date number):
date = datenum(import.textdata(:,1), 'dd-mm-yy HH:MM:SS');
%Algorithm for interpolation
%Definition of a second
second = 1/(60*60*24);
%Definition of a minute (for performance reasons)
minute = 1/(60*24);
%Current position within import-vector
position = 1;
%Length of the data-vector (for performance reasons)
datalength = length(import.data(:,2));
%New Matrix for the interpolated data (contains: Date/Time, Interpolated Temperature, Lower Border, Upper Border)
ID = [];
%next save position (see below)
saveposition = 250;
disp(strcat('Start of Computation:_', datestr(now)));
%Initialise time to first complete minute of imported data and the starting position
starttime = (date(position) - mod(date(position),minute)) + minute;
while date(position + 1)
if ~isnan(import.data(i,6));<br />
dooropenings = dooropenings + import.data(i,6);<br />
else disp(strcat('NaN found at_: ', datestr(date(position))));<br />
end<br />
end<br />
for i = starttime:minute:date(position+jumplength);
ID = [ID; [i, round(10 * interp1([date(position), date(position+jumplength)], [import.data(position,2), import.data(position+jumplength,2)], i, 'linear'))/10, dooropenings, import.data(position,4), import.data(position,5)]];
dooropenings = 0; %To make sure, that number of dooropenings is only added once
starttime = starttime + minute;
%Correct calculation mistakes
if mod(starttime,minute) >= second;
starttime = (starttime - mod(starttime,minute)) + minute;
end;
end
position = position + jumplength;
%store to disk, if next 250 positions are reached (performance reasons)
if position >= saveposition;
dlmwrite(strcat(path, filename, '- Interpolated.txt'), ID, 'delimiter', ';', 'newline', 'pc', 'precision', '%.12f', '-append');
ID = [];
saveposition = saveposition + 250;
end
end
%Save the rest
dlmwrite(strcat(path, filename, '- Interpolated.txt'), ID, 'delimiter', ';', 'newline', 'pc', 'precision', '%.12f', '-append');
disp(strcat('End of Computation:_', datestr(now)));
%Import file back from disk
interpolation = importdata(strcat(path, filename, '- Interpolated.txt'));
%Show Summary of the imported data
%Count Dooropenings in Original File
dooropenings = 0;
for i = 1:length(import.data);
if ~isnan(import.data(i,6));
dooropenings = dooropenings + import.data(i,6);
end
end
disp(strcat('Dooropenings (Original File):_', int2str(dooropenings)));
disp(strcat('Dooropenings (Interpolated File):_', int2str(sum(interpolation(:,3)))));
disp(strcat('Dataset Starting Time:_', datestr(interpolation(1,1))));
disp(strcat('Dataset Ending Time:_', datestr(interpolation(length(interpolation),1))));
Appendix 2 – Implementation of Statistical Methods
This section introduces the implementation of the suggested statistical data analysis. As described in section 6.1, the promising statistical measures are calculated on a daily basis (whole day, daytime and nighttime). All results are exported to Microsoft Excel files to allow additional data analysis. Moreover, the graphs from chapter 6 are plotted and saved to disk.
Basic steps of this implementation are:<br />
1. Import of monitoring data and interpolated data<br />
2. Calculation of daily minimum and maximum (whole day, daytime, nighttime)<br />
(based on non-interpolated data)<br />
3. Calculation of daily mean, mode, median, standard deviation<br />
(whole day, daytime, nighttime) (based on interpolated data)<br />
4. Calculation of daily door openings (whole day, daytime, nighttime)<br />
5. Calculation of temperature distribution<br />
6. Creation of graphs<br />
7. Storage of calculated values and graphs to disk<br />
The actual implementation is printed on the following pages.<br />
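The day/night separation underlying steps 2 to 4 can be sketched in isolation (a Python sketch; the 6:00-22:00 daytime window matches the daybegin/dayend definitions in the Matlab listing below, and the sample values are hypothetical):

```python
# Sketch: split (hour-of-day, temperature) samples into daytime and
# nighttime lists; daytime is the half-open window [6, 22).
def split_day_night(samples, day_begin=6, day_end=22):
    day = [t for h, t in samples if day_begin <= h < day_end]
    night = [t for h, t in samples if not (day_begin <= h < day_end)]
    return day, night

samples = [(3, -20.2), (9, -19.5), (15, -18.9), (23, -20.1)]
print(split_day_night(samples))  # → ([-19.5, -18.9], [-20.2, -20.1])
```

The statistical measures (minimum, maximum, mean, and so on) would then be computed separately on each of the two lists, as in the Matlab implementation.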
%Name and location of the source file
unit = 1;
filename = strcat('Channel', int2str(unit), '.csv');
path = 'C:\Dokumente und Einstellungen\Christian\Eigene Dateien\Dokumente\Studium\Diplomarbeit\Monitoring Data Nijmegen (Converted)\';
%Import Dataset<br />
import = importdata(strcat(path, filename));<br />
%Create Datevector (as serial date number):<br />
date = datenum(import.textdata(:,1), 'dd-mm-yy HH:MM:SS');<br />
%Import Interpolated Data from Disk<br />
interpolation = importdata(strcat(path, filename, '- Interpolated.txt'));<br />
%Definition of a Second<br />
second = 1/(24*60*60);<br />
%Definition of a Minute (For Performance Reasons)<br />
minute = 1/(60*24);<br />
%Definition of Day- and Nighttime<br />
daybegin = (1/24)*6;<br />
dayend = (1/24)*22;<br />
%Start (Index of the imported data)<br />
%1 = Begin of imported file, add 1440 per Day<br />
start = 1 + 7*1440;<br />
%-----Minima & Maxima-----<br />
%(Use non interpolated data)<br />
%Auxiliary variable<br />
startposition = 1;<br />
while floor(date(startposition) + second) < floor(interpolation(start,1) + second);
startposition = startposition + 1;
end
jumplength = 0;<br />
%Create Datevector for first column (per day)<br />
dailydate = floor(date(startposition) + second);<br />
%Create Minvector for second column (per day)<br />
minvector = [];<br />
%Create Mindaytimevector for third column (per day)<br />
mindaytime = [];<br />
%Create Minnighttimevector for fourth column (per day)
minnighttime = [];<br />
%The same for the maxima table<br />
maxvector = [];<br />
maxdaytime = [];<br />
maxnighttime = [];<br />
for i = startposition:length(date);<br />
if isequal(floor(date(startposition)+ second), floor(date(i) + second));<br />
jumplength = jumplength + 1;<br />
else<br />
dailydate = [dailydate; floor(date(i) + second)];
%Vector for daily minimum & maximum
minvector = [minvector; min(import.data(startposition:startposition + jumplength - 1,2))];
maxvector = [maxvector; max(import.data(startposition:startposition + jumplength - 1,2))];
%Compute Day- and Nighttime Values (Daytime: See Definition of Daybegin & Dayend)
nighttemp = [];<br />
daytemp = [];<br />
for j= startposition:startposition + jumplength - 1;<br />
if (mod(date(j),1) >= daybegin) && (mod(date(j),1) = daybegin) && (mod(date(j) + second, 1)<br />
xlswrite(strcat(path, 'Excel\', filename, '- Maxima'), [dailydate-693960, maxvector, maxdaytime, maxnighttime], 'Maxima', 'A2');
%Total Values<br />
totalmin = min(import.data(:,2));<br />
totalmax = max(import.data(:,2));<br />
%-----Mean, Median, Mode, Standard Deviation-----<br />
%(Use interpolated data)<br />
%Create Date Vector<br />
interpolateddailydate = floor(interpolation(start,1) + second);
%create Mean Vectors<br />
meanvector = [];<br />
meandaytime = [];<br />
meannighttime = [];<br />
%create Median Vectors<br />
medianvector = [];<br />
mediandaytime = [];<br />
mediannighttime = [];<br />
%create Mode Vectors<br />
modevector = [];<br />
modedaytime = [];<br />
modenighttime = [];<br />
%create Standard Deviation Vectors<br />
stdvector = [];<br />
stddaytime = [];<br />
stdnighttime = [];<br />
%Create Vectors for Number of Dooropenings<br />
dailydooropenings = [];<br />
daytimedooropenings = [];<br />
nighttimedooropenings = [];<br />
%Auxiliary Variables<br />
startposition = start;<br />
jumplength = 0;<br />
for i = startposition:length(interpolation);
if isequal(floor(interpolation(startposition,1) + second), floor(interpolation(i,1) + second));
jumplength = jumplength + 1;
else %This is called, when date changes...
interpolateddailydate = [interpolateddailydate; floor(interpolation(i,1) + second)];
%Vectors for Daily Values (Mean, Median, Mode, Standard Deviation)
meanvector = [meanvector; mean(interpolation(startposition:startposition + jumplength - 1,2))];
medianvector = [medianvector; median(interpolation(startposition:startposition + jumplength - 1,2))];
modevector = [modevector; mode(interpolation(startposition:startposition + jumplength - 1,2))];
stdvector = [stdvector; std(interpolation(startposition:startposition + jumplength - 1,2))];
%Count Dooropenings per Day
dailydooropenings = [dailydooropenings; sum(interpolation(startposition:startposition + jumplength - 1,3))];
%Compute Day- and Nighttime Values<br />
%(Mean, Median, Mode, Standard Deviation)<br />
nighttemp = [];<br />
daytemp = [];<br />
for j= startposition:startposition + jumplength - 1;<br />
if (mod(interpolation(j,1) + second, 1) >= daybegin) &&<br />
(mod(interpolation(j,1) + second, 1)
medianvector = [medianvector; median(interpolation(startposition:startposition + jumplength - 1,2))];
modevector = [modevector; mode(interpolation(startposition:startposition + jumplength - 1,2))];
stdvector = [stdvector; std(interpolation(startposition:startposition + jumplength - 1,2))];
%Day- and Nighttime...<br />
nighttemp = [];<br />
daytemp = [];<br />
for j= startposition:startposition + jumplength - 1;<br />
if (mod(interpolation(j,1) + second, 1) >= daybegin) &&<br />
(mod(interpolation(j,1) + second, 1)
xlswrite(strcat(path, 'Excel\', filename, '- Median'), [interpolateddailydate-693960, round(medianvector*10)/10, round(mediandaytime*10)/10, round(mediannighttime*10)/10], 'Median', 'A2');
xlswrite(strcat(path, 'Excel\', filename, '- Mode'), [interpolateddailydate-693960, round(modevector*10)/10, round(modedaytime*10)/10, round(modenighttime*10)/10], 'Mode', 'A2');
xlswrite(strcat(path, 'Excel\', filename, '- Standard Deviation'), [interpolateddailydate-693960, round(stdvector*10)/10, round(stddaytime*10)/10, round(stdnighttime*10)/10], 'Standard Deviation', 'A2');
xlswrite(strcat(path, 'Excel\', filename, '- Dooropenings'), [interpolateddailydate-693960, dailydooropenings, daytimedooropenings, nighttimedooropenings], 'Dooropenings', 'A2');
%Total Values<br />
totalmean = mean(interpolation(:,2));<br />
totalmedian = median(interpolation(:,2));<br />
totalmode = mode(interpolation(:,2));<br />
totalstd = std(interpolation(:,2));<br />
totaldooropenings = sum(interpolation(:,3));<br />
%-----Temperature Distribution-----<br />
%Count total occurrences of single values<br />
%"Round" Command necessary in MATLAB. Otherwise some comparisons fail!<br />
%Contains[Temperature, Minutes of Occurence]<br />
totalOC = [];<br />
for i = min(interpolation(:,2)):0.1:max(interpolation(:,2));<br />
totalOC = [totalOC; [i, sum(interpolation(:,2) == round(i*10)/10)]];<br />
end<br />
%-----Plot Statements-----
%Temperature Overview
plot(date, import.data(:,2), 'k');
hold on;
plot(interpolation(:,1), interpolation(:,4), '--b')
plot(interpolation(:,1), interpolation(:,5), '--r')
datetick('x',20, 'keeplimits');
title 'Temperature Overview';
xlabel 'Date';
ylabel 'Temperature (°C)';
axis([min(date) max(date) min(import.data(:,2)) max(import.data(:,2))]);
hold off;
print('-dtiff', strcat(path, 'Graphs\', filename, '- Temperature Overview.tif'));
%Maximum Values per Day
bar([dailydate(1):dailydate(length(maxvector))], maxvector, 'k');
hold on;
plot(interpolation(:,1), interpolation(:,4), '--b')
plot(interpolation(:,1), interpolation(:,5), '--r')
datetick('x',20, 'keeplimits');
title 'Maximum Values per Day';
xlabel 'Date';
ylabel 'Temperature (°C)';
axis([dailydate(1) dailydate(length(maxvector)) min(maxvector) max(maxvector)]);
hold off;
print('-dtiff', strcat(path, 'Graphs\', filename, '- Maximum Values per Day.tif'));
%Maximum Values at Daytime
bar([dailydate(1):dailydate(length(maxdaytime))], maxdaytime, 'k');
hold on;
plot(interpolation(:,1), interpolation(:,4), '--b')
plot(interpolation(:,1), interpolation(:,5), '--r')
datetick('x',20, 'keeplimits');
title 'Maximum Values at Daytime';
xlabel 'Date';
ylabel 'Temperature (°C)';
axis([dailydate(1) dailydate(length(maxvector)) min(maxvector) max(maxvector)]);
hold off;
print('-dtiff', strcat(path, 'Graphs\', filename, '- Maximum Values at Daytime.tif'));
%Maximum Values at Nighttime
bar([dailydate(1):dailydate(length(maxnighttime))], maxnighttime, 'k');
hold on;
plot(interpolation(:,1), interpolation(:,4), '--b')
plot(interpolation(:,1), interpolation(:,5), '--r')
datetick('x',20, 'keeplimits');
title 'Maximum Values at Nighttime';
xlabel 'Date';
ylabel 'Temperature (°C)';
axis([dailydate(1) dailydate(length(maxvector)) min(maxvector) max(maxvector)]);
hold off;
print('-dtiff', strcat(path, 'Graphs\', filename, '- Maximum Values at Nighttime.tif'));
%Minimum Values per Day
bar([dailydate(1):dailydate(length(minvector))], minvector, 'k');
hold on;
plot(interpolation(:,1), interpolation(:,4), '--b')
plot(interpolation(:,1), interpolation(:,5), '--r')
datetick('x',20, 'keeplimits');
title 'Minimum Values per Day';
xlabel 'Date';
ylabel 'Temperature (°C)';
axis([dailydate(1) dailydate(length(minvector)) min(minvector) max(minvector)]);
hold off;
print('-dtiff', strcat(path, 'Graphs\', filename, '- Minimum Values per Day.tif'));
%Minimum Values at Daytime
bar([dailydate(1):dailydate(length(mindaytime))], mindaytime, 'k');
hold on;
plot(interpolation(:,1), interpolation(:,4), '--b')
plot(interpolation(:,1), interpolation(:,5), '--r')
datetick('x',20, 'keeplimits');
title 'Minimum Values at Daytime';
xlabel 'Date';
ylabel 'Temperature (°C)';
axis([dailydate(1) dailydate(length(minvector)) min(minvector) max(minvector)]);
hold off;
print('-dtiff', strcat(path, 'Graphs\', filename, '- Minimum Values at Daytime.tif'));
%Minimum Values at Nighttime
bar([dailydate(1):dailydate(length(minnighttime))], minnighttime, 'k');
hold on;
plot(interpolation(:,1), interpolation(:,4), '--b')
plot(interpolation(:,1), interpolation(:,5), '--r')
datetick('x',20, 'keeplimits');
title 'Minimum Values at Nighttime';
xlabel 'Date';
ylabel 'Temperature (°C)';
axis([dailydate(1) dailydate(length(minvector)) min(minvector) max(minvector)]);
hold off;
print('-dtiff', strcat(path, 'Graphs\', filename, '- Minimum Values at Nighttime.tif'));
%Mean Values per Day
bar([dailydate(1):dailydate(length(meanvector))], meanvector, 'k');
datetick('x',20, 'keeplimits');
title 'Mean Values per Day';
xlabel 'Date';
ylabel 'Temperature (°C)';
axis([dailydate(1) dailydate(length(meanvector)) min(meanvector) max(meanvector)]);
print('-dtiff', strcat(path, 'Graphs\', filename, '- Mean Values per Day.tif'));
%Mean Values at Daytime
bar([dailydate(1):dailydate(length(meandaytime))], meandaytime, 'k');
datetick('x',20, 'keeplimits');
title 'Mean Values at Daytime';
xlabel 'Date';
ylabel 'Temperature (°C)';
axis([dailydate(1) dailydate(length(meanvector)) min(meanvector) max(meanvector)]);
print('-dtiff', strcat(path, 'Graphs\', filename, '- Mean Values at Daytime.tif'));
%Mean Values at Nighttime
bar([dailydate(1):dailydate(length(meannighttime))], meannighttime, 'k');
datetick('x',20, 'keeplimits');
title 'Mean Values at Nighttime';
xlabel 'Date';
ylabel 'Temperature (°C)';
axis([dailydate(1) dailydate(length(meanvector)) min(meanvector) max(meanvector)]);
print('-dtiff', strcat(path, 'Graphs\', filename, '- Mean Values at Nighttime.tif'));
%Median Values per Day
bar([dailydate(1):dailydate(length(medianvector))], medianvector, 'k');
datetick('x',20, 'keeplimits');
title 'Median Values per Day';
xlabel 'Date';
ylabel 'Temperature (°C)';
axis([dailydate(1) dailydate(length(medianvector)) min(medianvector) max(medianvector)]);
print('-dtiff', strcat(path, 'Graphs\', filename, '- Median Values per Day.tif'));
%Median Values at Daytime
bar([dailydate(1):dailydate(length(mediandaytime))], mediandaytime, 'k');
datetick('x',20, 'keeplimits');
title 'Median Values at Daytime';
xlabel 'Date';
ylabel 'Temperature (°C)';
axis([dailydate(1) dailydate(length(medianvector)) min(medianvector) max(medianvector)]);
print('-dtiff', strcat(path, 'Graphs\', filename, '- Median Values at Daytime.tif'));
%Median Values at Nighttime
bar([dailydate(1):dailydate(length(mediannighttime))], mediannighttime, 'k');
datetick('x',20, 'keeplimits');
title 'Median Values at Nighttime';
xlabel 'Date';
ylabel 'Temperature (°C)';
axis([dailydate(1) dailydate(length(medianvector)) min(medianvector) max(medianvector)]);
print('-dtiff', strcat(path, 'Graphs\', filename, '- Median Values at Nighttime.tif'));
%Mode Values per Day
bar([dailydate(1):dailydate(length(modevector))], modevector, 'k');
datetick('x',20, 'keeplimits');
title 'Mode Values per Day';
xlabel 'Date';
ylabel 'Temperature (°C)';
axis([dailydate(1) dailydate(length(modevector)) min(modevector) max(modevector)]);
print('-dtiff', strcat(path, 'Graphs\', filename, '- Mode Values per Day.tif'));
%Mode Values at Daytime
bar([dailydate(1):dailydate(length(modedaytime))], modedaytime, 'k');
datetick('x',20, 'keeplimits');
title 'Mode Values at Daytime';
xlabel 'Date';
ylabel 'Temperature (°C)';
axis([dailydate(1) dailydate(length(modevector)) min(modevector) max(modevector)]);
print('-dtiff', strcat(path, 'Graphs\', filename, '- Mode Values at Daytime.tif'));
%Mode Values at Nighttime
bar([dailydate(1):dailydate(length(modenighttime))], modenighttime, 'k');
datetick('x',20, 'keeplimits');
title 'Mode Values at Nighttime';
xlabel 'Date';
ylabel 'Temperature (°C)';
axis([dailydate(1) dailydate(length(modevector)) min(modevector) max(modevector)]);
print('-dtiff', strcat(path, 'Graphs\', filename, '- Mode Values at Nighttime.tif'));
%Standard Deviation per Day<br />
A-124
ar([dailydate(1):dailydate(length(stdvector))], stdvector, 'k');<br />
datetick('x',20, 'keeplimits');<br />
title ({'Standard Deviation per Day'; strcat('(',<br />
num2str(round(totalmean*10)/10), '°C Mean Value)')});<br />
xlabel 'Date';<br />
ylabel 'Temperature (°C)';<br />
axis([dailydate(1) dailydate(length(stdvector)) min(stdvector)<br />
max(stdvector)]);<br />
print('-dtiff', strcat(path, 'Graphs\', filename, '- Standard Deviation per<br />
Day.tif'));<br />
%Standard Deviation at Daytime
bar([dailydate(1):dailydate(length(stddaytime))], stddaytime, 'k');
datetick('x',20, 'keeplimits');
title ({'Standard Deviation at Daytime'; strcat('(', num2str(round(totalmean*10)/10), '°C Mean Value)')});
xlabel 'Date';
ylabel 'Temperature (°C)';
axis([dailydate(1) dailydate(length(stdvector)) min(stdvector) max(stdvector)]);
print('-dtiff', strcat(path, 'Graphs\', filename, '- Standard Deviation at Daytime.tif'));
%Standard Deviation at Nighttime
bar([dailydate(1):dailydate(length(stdnighttime))], stdnighttime, 'k');
datetick('x',20, 'keeplimits');
title ({'Standard Deviation at Nighttime'; strcat('(', num2str(round(totalmean*10)/10), '°C Mean Value)')});
xlabel 'Date';
ylabel 'Temperature (°C)';
axis([dailydate(1) dailydate(length(stdvector)) min(stdvector) max(stdvector)]);
print('-dtiff', strcat(path, 'Graphs\', filename, '- Standard Deviation at Nighttime.tif'));
%Temperature Distribution
bar(min(totalOC(:,1)):0.1:max(totalOC(:,1)), totalOC(:,2), 'k');
title 'Total Occurence of Temperature Values';
xlabel 'Temperature (°C)';
ylabel 'Time (Minutes)';
axis([totalOC(1,1) totalOC(length(totalOC),1) min(totalOC(:,2)) max(totalOC(:,2))]);
print('-dtiff', strcat(path, 'Graphs\', filename, '- Total Occurence of Temperature Values.tif'));
%If Doorsensor is installed...
if max(dailydooropenings) > 0;
%Dooropenings per Day
bar([dailydate(1):dailydate(length(dailydooropenings))], dailydooropenings, 'k');
datetick('x',20, 'keeplimits');
title 'Dooropenings per Day';
xlabel 'Date';
ylabel 'Number of Dooropenings';
axis([dailydate(1) dailydate(length(dailydooropenings)) min(dailydooropenings) max(dailydooropenings)]);
print('-dtiff', strcat(path, 'Graphs\', filename, '- Dooropenings per Day.tif'));
%Dooropenings at Daytime
bar([dailydate(1):dailydate(length(daytimedooropenings))], daytimedooropenings, 'k');
datetick('x',20, 'keeplimits');
title 'Dooropenings at Daytime';
xlabel 'Date';
ylabel 'Number of Dooropenings';
axis([dailydate(1) dailydate(length(dailydooropenings)) min(dailydooropenings) max(dailydooropenings)]);
print('-dtiff', strcat(path, 'Graphs\', filename, '- Dooropenings at Daytime.tif'));
%Dooropenings at Nighttime
bar([dailydate(1):dailydate(length(nighttimedooropenings))], nighttimedooropenings, 'k');
datetick('x',20, 'keeplimits');
title 'Dooropenings at Nighttime';
xlabel 'Date';
ylabel 'Number of Dooropenings';
axis([dailydate(1) dailydate(length(dailydooropenings)) min(dailydooropenings) max(dailydooropenings)]);
print('-dtiff', strcat(path, 'Graphs\', filename, '- Dooropenings at Nighttime.tif'));
end
Appendix 3 – Implementation of Data Mining Methods
This section introduces the implementation of the data mining methods suggested in section 6.1.
Basic steps of this implementation are:
1. Import of monitoring data and interpolated data
2. Determination of alarms (original limits, self-defined limits)
3. Determination of alarms (with door opening recognition)
4. Determination of alarm durations and the corresponding classification limits
5. Determination of maximum alarm temperatures and the corresponding classification limits
6. Determination of the duration of door openings and the corresponding classification limits
The actual implementation is printed on the following pages.
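The core of step 2 is a scan over the (interpolated) per-minute temperature series: whenever a value crosses a limit, the scan measures how long the excursion lasts and records its extreme temperature. The idea can be sketched as follows; this is an illustrative Python sketch, not part of the original MATLAB listing, and the function name `find_alarms` and the example series are chosen here purely for demonstration.

```python
def find_alarms(temps, upper_limit, lower_limit):
    """Scan a temperature series and return alarm events.

    Each event is (start_index, kind, duration, extreme_temp), where
    kind is +1 for a high-temperature alarm and -1 for a low one --
    analogous to the [Date, Type of Alarm, Duration, Maximum Temperature]
    rows collected by the MATLAB listing.
    """
    alarms = []
    i = 0
    n = len(temps)
    while i < n:
        if temps[i] >= upper_limit:
            # Measure how long the series stays at or above the upper limit
            k = i
            while k < n and temps[k] >= upper_limit:
                k += 1
            alarms.append((i, 1, k - i, max(temps[i:k])))
            i = k
        elif temps[i] <= lower_limit:
            # Same scan for the low-temperature case
            k = i
            while k < n and temps[k] <= lower_limit:
                k += 1
            alarms.append((i, -1, k - i, min(temps[i:k])))
            i = k
        else:
            i += 1
    return alarms

# Example: one high and one low excursion in a short per-minute series
series = [4.0, 4.5, 6.2, 7.0, 6.5, 4.0, 1.5, 1.0, 3.0]
print(find_alarms(series, upper_limit=6, lower_limit=2))
# → [(2, 1, 3, 7.0), (6, -1, 2, 1.0)]
```

The MATLAB listing additionally suppresses alarms that start within an offset window after a door opening (step 3); that amounts to one extra check on the door-sensor column before an event is recorded.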
%Name and location of the source file
unit = 1;
filename = strcat('Channel', int2str(unit), '.csv');
path = 'C:\Dokumente und Einstellungen\Christian\Eigene Dateien\Dokumente\Studium\Diplomarbeit\Monitoring Data Nijmegen (Converted)\';
%Import Dataset
import = importdata(strcat(path, filename));
%Create Datevector (as serial date number):
date = datenum(import.textdata(:,1), 'dd-mm-yy HH:MM:SS');
%Import Interpolated Data from Disk
interpolation = importdata(strcat(path, filename, '- Interpolated.txt'));
%Definition of a Second
second = 1/(24*60*60);
%Definition of a Minute (For Performance Reasons)
minute = 1/(60*24);
%Definition of Day- and Nighttime
daybegin = (1/24)*6;
dayend = (1/24)*22;
%Start (Index of the imported data)
%1 = Begin of imported file, add 1440 per Day
start = 1 + 7*1440;
%Self defined limits:
upperLimit = 6;
lowerLimit = 2;
DoorOffset = 3; %Offset in Minutes
%Determine Number of Occured Alarms
%-----using the Original Limits!-----
%Contains Date and Information, which Kind of Alarm
%(1 = Above High Temperature Border; -1 = Below Low Temperature Border)
%and Duration
%Contains: [Date, Type of Alarm, Duration, Maximum Temperature]
AlarmsOL = [];
Alarmbefore = 0;
Duration = 0;
maxtemp = 0;
for i = start:length(interpolation(:,1));
if (interpolation(i,2) >= interpolation(i,5)) && (Alarmbefore == 0);
%Get Duration
k = i;
while (k < length(interpolation(:,1))) && (interpolation(k,2) >= interpolation(k,5));
k = k + 1;
end
Duration = k - i;
maxtemp = max(interpolation(i:k,2));
AlarmsOL = [AlarmsOL; [interpolation(i,1), 1, Duration, maxtemp]];
Alarmbefore = 1;
Duration = 0;
maxtemp = 0;
%Low-Temperature Alarm
elseif (interpolation(i,2) <= interpolation(i,6)) && (Alarmbefore == 0);
%Get Duration
k = i;
while (k < length(interpolation(:,1))) && (interpolation(k,2) <= interpolation(k,6));
k = k + 1;
end
Duration = k - i;
maxtemp = min(interpolation(i:k,2));
AlarmsOL = [AlarmsOL; [interpolation(i,1), -1, Duration, maxtemp]];
Alarmbefore = 1;
Duration = 0;
maxtemp = 0;
elseif (interpolation(i,2) < interpolation(i,5)) && (interpolation(i,2) > interpolation(i,6));
Alarmbefore = 0;
end
end
%-----The Same Calculation with self defined Limits-----
%-----(Ignore Alarms after Dooropenings in Offset Time)-----
%Contains: [Date, Type of Alarm, Duration, Maximum Temperature]
AlarmsDLNoDoor = [];
Alarmbefore = 0;
Duration = 0;
maxtemp = 0;
for i = (start + DoorOffset):length(interpolation(:,1))-1;
%High-Temperature Alarm
if (interpolation(i,2) >= upperLimit) && (Alarmbefore == 0);
%Only, if there was no dooropening...
if sum(interpolation(i-DoorOffset:i+1,3)) == 0;
%Get Duration
k = i;
while (k < length(interpolation(:,1))) && (interpolation(k,2) >= upperLimit);
k = k + 1;
end
Duration = k - i;
maxtemp = max(interpolation(i:k,2));
AlarmsDLNoDoor = [AlarmsDLNoDoor; [interpolation(i,1), 1, Duration, maxtemp]];
end
Alarmbefore = 1;
Duration = 0;
maxtemp = 0;
%Low-Temperature Alarm
elseif (interpolation(i,2) <= lowerLimit) && (Alarmbefore == 0);
%Only, if there was no dooropening...
if sum(interpolation(i-DoorOffset:i+1,3)) == 0;
%Get Duration
k = i;
while (k < length(interpolation(:,1))) && (interpolation(k,2) <= lowerLimit);
k = k + 1;
end
Duration = k - i;
maxtemp = min(interpolation(i:k,2));
AlarmsDLNoDoor = [AlarmsDLNoDoor; [interpolation(i,1), -1, Duration, maxtemp]];
end
Alarmbefore = 1;
Duration = 0;
maxtemp = 0;
elseif (interpolation(i,2) < upperLimit) && (interpolation(i,2) > lowerLimit);
Alarmbefore = 0;
end
end
%Contains [Duration, Number of Occurences, Percentage, Accumulated
%Percentage]
DurationDL = [];
OccurenceTemp = 0; %For Performance Reason
for i = min(AlarmsDL(:,3)):max(AlarmsDL(:,3));
OccurenceTemp = histc(AlarmsDL(:,3), round(i*10)/10);
if isempty(DurationDL);
DurationDL = [DurationDL; [i, OccurenceTemp, (OccurenceTemp/length(AlarmsDL(:,1)))*100, (OccurenceTemp/length(AlarmsDL(:,1)))*100]];
OccurenceTemp = 0;
else
if OccurenceTemp > 0;
DurationDL = [DurationDL; [i, OccurenceTemp, (OccurenceTemp/length(AlarmsDL(:,1)))*100, sum(DurationDL(1:length(DurationDL(:,1)),3)) + (OccurenceTemp/length(AlarmsDL(:,1)))*100]];
end
OccurenceTemp = 0;
end
end
%-----Check Probability of Current Temperature (within Alarming Situations)
%(Calculated by using the maximum values per alarm)
%Contains [Maximum Temperature, Number of Occurences, Percentage, Accumulated
%Percentage]
ProbabilityDL = [];
OccurenceTemp = 0; %For Performance Reason
for i = min(AlarmsDL(:,4)):0.1:max(AlarmsDL(:,4));
OccurenceTemp = histc(AlarmsDL(:,4), round(i*10)/10);
if isempty(ProbabilityDL);
ProbabilityDL = [ProbabilityDL; [i, OccurenceTemp, (OccurenceTemp/length(AlarmsDL(:,1)))*100, (OccurenceTemp/length(AlarmsDL(:,1)))*100]];
OccurenceTemp = 0;
else
if OccurenceTemp > 0;
ProbabilityDL = [ProbabilityDL; [i, OccurenceTemp, (OccurenceTemp/length(AlarmsDL(:,1)))*100, sum(ProbabilityDL(1:length(ProbabilityDL(:,1)),3)) + (OccurenceTemp/length(AlarmsDL(:,1)))*100]];
end
OccurenceTemp = 0;
end
end
%-----Durations of Dooropenings-----
%Contains: [Date, Duration (in seconds)]
Dooropeningtime = [];
%Get Startingposition for non interpolated data
startposition = 1;
while floor(date(startposition) + second) < floor(interpolation(start,1) + second);
startposition = startposition + 1;
end
%Get Duration of Dooropenings
for i = startposition:length(import.data(:,1))-1;
%Only at the begin of a dooropening...
if (i == startposition) || (import.data(i-1,6) == 0);
if (import.data(i,6) == 1) && (import.data(i+1,6) == 0);
Dooropeningtime = [Dooropeningtime; [date(i), round((date(i+1)-date(i))*60*60*24)]];
else
if (import.data(i,6) == 1) && (import.data(i+1,6) == 1);
jumplength = 0;
while (import.data(i+jumplength,6) == 1) && (i + jumplength < length(import.data(:,6)));
jumplength = jumplength + 1;
end
Dooropeningtime = [Dooropeningtime; [date(i), round((date(i+jumplength)-date(i))*60*60*24)]];
end
end
end
end
%-----Check Probability of Dooropening Durations-----
%Contains [Duration, Number of Occurences, Percentage, Accumulated
%Percentage]
DoorProbability = [];
OccurenceTemp = 0; %For Performance Reason
for i = min(Dooropeningtime(:,2)):max(Dooropeningtime(:,2));
OccurenceTemp = histc(Dooropeningtime(:,2), i);
if isempty(DoorProbability);
DoorProbability = [DoorProbability; [i, OccurenceTemp, (OccurenceTemp/length(Dooropeningtime(:,1)))*100, (OccurenceTemp/length(Dooropeningtime(:,1)))*100]];
OccurenceTemp = 0;
else
if OccurenceTemp > 0;
DoorProbability = [DoorProbability; [i, OccurenceTemp, (OccurenceTemp/length(Dooropeningtime(:,1)))*100, sum(DoorProbability(1:length(DoorProbability(:,1)),3)) + (OccurenceTemp/length(Dooropeningtime(:,1)))*100]];
end
OccurenceTemp = 0;
end
end
%-----Display Information for a Certain Percentage-----
Percentagelimit = 50;
disp(strcat('Dooropening Duration (Seconds) reached in ', num2str(Percentagelimit), '% of the Cases:'));
disp(DoorProbability(min(find(DoorProbability(:,4) >= Percentagelimit)),1));
Erklärung (Statement)
Diplomarbeit
by: Christian Kaak
Matr. No.: 2690287
Topic:
Ausfallprognosen mit Hilfe erweiterter Monitoring Systeme
(Breakdown Prediction by the Use of Extended Monitoring Systems)
I confirm with my signature that I have written this thesis independently and without the use of aids other than those stated. All passages taken literally or in substance from published or unpublished sources have been marked as such.
This thesis, or excerpts from it, has not previously been submitted in the same or a similar form to this or any other examination authority.
I am aware that submitting a false declaration means that the Diplom examination is to be considered failed.
Braunschweig, 05.02.2007
Signature