Anomaly Detection for Monitoring
anomaly-detection-monitoring
anomaly-detection-monitoring
Create successful ePaper yourself
Turn your PDF publications into a flip-book with our unique Google optimized e-Paper software.
As an aside, there’s a rumor going around that the Central Limit<br />
Theorem guarantees that samples from any population will be normally<br />
distributed, no matter what the population’s distribution is.<br />
This is a misreading of the theorem, and we assure you that machine<br />
data is not automatically Gaussian just because it’s obtained by sampling!<br />
Conclusions<br />
All anomaly detection relies on predicting an expected value or<br />
range of values <strong>for</strong> a metric, and then comparing observations to the<br />
predictions. The predictions rely on models, which can be based on<br />
theory or on empirical evidence. Models usually use historical data<br />
as inputs to derive the parameters that are used to predict the future.<br />
We discussed SPC techniques not only because they’re ubiquitous<br />
and very useful when paired with a good model (a theme we’ll<br />
revisit), but because they embody a thought process that is tremendously<br />
helpful in working through all kinds of anomaly detection<br />
problems. This thought process can be applied to lots of different<br />
kinds of models, including ARIMA models.<br />
When you model and predict some data in order to try to detect<br />
anomalies in it, you need to evaluate the quality of the results. This<br />
really means you need to measure the prediction errors—the residuals—and<br />
assess how good your model is at predicting the system’s<br />
data. If you’ll be using SPC to determine which observations are<br />
anomalous, you generally need to ensure that the residuals are normally<br />
distributed (Gaussian). When you do this, be sure that you<br />
don’t confuse the sample distribution with the population distribution!<br />
Conclusions | 31