Anomaly Detection for Monitoring
lem in your system, that’s great. Go ahead and alert on it. But otherwise,<br />
we suggest that you don’t alert on things that may have no<br />
impact or consequence.<br />
Instead, we suggest that you record these anomalous observations<br />
but don’t alert on them. You have then essentially created an index<br />
into the most unusual data points in your metrics, for later use in<br />
case they prove interesting, for example during diagnosis of a problem<br />
that you have detected.<br />
One of the assumptions embedded in this recommendation is that<br />
anomaly detection is cheap enough to do online in one pass as data<br />
arrives in your monitoring system, but that ad hoc, after-the-fact<br />
anomaly detection is too costly to do interactively. With the monitoring<br />
data sizes that we are seeing in the industry today, and the<br />
attitude that you should “measure everything that moves,” this is<br />
generally the case. Multi-terabyte anomaly detection analysis is usually<br />
unacceptably slow and requires more resources than you have<br />
available. Again, we are placing this in the context of what most of<br />
us are doing for monitoring, using typical open-source tools and<br />
methodologies.<br />
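As an illustration of the "one pass, record but don't alert" idea, here is a minimal Python sketch. It is not a method prescribed by the text: the class name `OnlineAnomalyRecorder`, the z-score test, the three-sigma threshold, and the warm-up length are all assumptions chosen for the example. It keeps a running mean and variance with Welford's update, so each observation is processed once in constant memory, and it appends unusual points to an in-memory index instead of raising an alert.

```python
import math
from dataclasses import dataclass, field


@dataclass
class OnlineAnomalyRecorder:
    """Hypothetical one-pass z-score detector that records instead of alerting."""
    threshold: float = 3.0   # sigmas; an illustrative choice, not a recommendation
    warmup: int = 30         # observations to collect before judging any point
    count: int = 0
    mean: float = 0.0
    m2: float = 0.0          # running sum of squared deviations from the mean
    index: list = field(default_factory=list)

    def observe(self, timestamp, value):
        # Score the new point against statistics of the data seen so far.
        if self.count >= self.warmup:
            std = math.sqrt(self.m2 / (self.count - 1))
            if std > 0:
                z = (value - self.mean) / std
                if abs(z) > self.threshold:
                    # Record the observation for later diagnosis; no alert.
                    self.index.append((timestamp, value, z))
        # Welford's update: fold the point into the running statistics.
        # Note that anomalous points are folded in too, a simplification.
        self.count += 1
        delta = value - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (value - self.mean)

    def anomalies_between(self, start, end):
        # Look up recorded anomalies during diagnosis of a detected problem.
        return [rec for rec in self.index if start <= rec[0] <= end]


recorder = OnlineAnomalyRecorder()
for t in range(100):
    recorder.observe(t, 10.0 + (t % 5) * 0.1)   # steady synthetic metric
recorder.observe(100, 50.0)                     # one unusual point
print(recorder.anomalies_between(90, 110))
```

In a real monitoring system the index would more plausibly live in durable storage next to the metrics themselves; the in-memory list only keeps the sketch self-contained.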
Conclusions<br />
Although it’s easy to get excited about success stories in anomaly<br />
detection, most of the time someone else’s techniques will not translate<br />
directly to your systems and your data. That’s why you have to<br />
learn for yourself what works, and what’s appropriate to use in some situations<br />
and not in others.<br />
Our suggestion, which will frame the discussion in the rest of this<br />
book, is that, generally speaking, you probably should use anomaly<br />
detection “online” as your data arrives. Store the results, but don’t<br />
alert on them in most cases. And keep in mind that the map is not<br />
the territory: the metric isn’t the system, an anomaly isn’t a crisis,<br />
three sigmas isn’t unlikely, and so on.<br />
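To make the "three sigmas isn’t unlikely" point concrete, the following back-of-the-envelope Python sketch estimates how many three-sigma observations to expect per day. It rests on loudly stated assumptions that are not from the text: the data is treated as Gaussian and independent (real metrics rarely are), and the fleet size of 1,000 metrics sampled once per second is hypothetical.

```python
import math


def two_sided_tail(sigmas):
    # P(|Z| > sigmas) for a standard normal variable Z.
    return math.erfc(sigmas / math.sqrt(2))


p = two_sided_tail(3.0)            # roughly 0.27% of observations

# Hypothetical workload, chosen only for illustration.
metrics = 1000
samples_per_metric_per_day = 24 * 60 * 60   # one sample per second

expected_per_day = p * metrics * samples_per_metric_per_day
print(f"tail probability: {p:.5f}")
print(f"expected 3-sigma points per day: {expected_per_day:,.0f}")
```

Under these assumptions the expected count is in the hundreds of thousands per day, which is why a three-sigma excursion on its own is a poor trigger for paging anyone.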
14 | Chapter 2: A Crash Course in Anomaly Detection