12.03.2016 Views

Anomaly Detection for Monitoring


Evaluating Predictions

One of the most important and subtle parts of anomaly detection happens at the intersection between predicting how a metric should behave and comparing observed values to those expectations.

In anomaly detection, you’re usually using “many standard deviations from the mean” as a proxy for “very unlikely,” and when you get far from the mean, you’re in the tails of the distribution. The fit tends to be much worse here than you’d expect, so even small deviations from Gaussian can result in many more outliers than you theoretically should get.
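
To make the tail problem concrete, here is a minimal sketch using only the Python standard library. It compares the share of points beyond three standard deviations for an exactly Gaussian sample and for a slightly "contaminated" one. The mixture (5% of points drawn with three times the spread), the seed, and all function names are assumptions chosen for illustration, not something taken from a real metric.

```python
import math
import random


def gaussian_tail_probability(k_sigma):
    """Two-sided chance that an exactly Gaussian value lies beyond k_sigma."""
    return math.erfc(k_sigma / math.sqrt(2))


def fraction_beyond(values, k_sigma):
    """Share of values more than k_sigma sample std devs from the sample mean."""
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    limit = k_sigma * math.sqrt(variance)
    return sum(1 for v in values if abs(v - mean) > limit) / len(values)


def contaminated_gaussian(n, rng, wide_share=0.05, wide_sigma=3.0):
    """Mostly N(0, 1), with a small share of points from a wider Gaussian.

    The share and the wider sigma are hypothetical illustration values.
    """
    return [
        rng.gauss(0.0, wide_sigma if rng.random() < wide_share else 1.0)
        for _ in range(n)
    ]


rng = random.Random(42)  # fixed seed so the sketch is repeatable
n = 100_000
pure = [rng.gauss(0.0, 1.0) for _ in range(n)]
mixed = contaminated_gaussian(n, rng)

expected = gaussian_tail_probability(3)
pure_fraction = fraction_beyond(pure, 3)
mixed_fraction = fraction_beyond(mixed, 3)

print(f"Gaussian theory:       {expected:.4%}")
print(f"Pure Gaussian sample:  {pure_fraction:.4%}")
print(f"Contaminated sample:   {mixed_fraction:.4%}")
```

Under these assumed settings the contaminated sample still looks roughly bell-shaped, yet its share of three-sigma "outliers" should come out several times larger than the Gaussian figure.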

Similarly, many statistical tests, such as hypothesis tests, are deemed “significant” or “good” based on what turn out to be statisticians’ rules of thumb. Just because a p-value looks really good doesn’t mean there’s truly a lot of certainty. “Significant” might not signify much. Hey, it’s statistics, after all!

As a result, there’s a good chance your anomaly detection techniques will sometimes give you more false positives than you expect. These problems will always happen; this is just par for the course. We’ll discuss some ways to mitigate this in later chapters.
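
A rough back-of-the-envelope calculation shows why false positives add up even when the Gaussian assumption holds perfectly. The sketch below assumes independent samples and a hypothetical one-second collection interval. Real metrics are usually autocorrelated and not exactly Gaussian, so treat the numbers as a baseline rather than a prediction.

```python
import math

SECONDS_PER_DAY = 86_400


def expected_false_alarms_per_day(k_sigma, interval_seconds):
    """Expected count of points beyond k_sigma per day for one metric.

    Assumes every sample is an independent draw from an exact Gaussian.
    """
    tail_probability = math.erfc(k_sigma / math.sqrt(2))
    samples_per_day = SECONDS_PER_DAY / interval_seconds
    return tail_probability * samples_per_day


for k in (2, 3, 4):
    alarms = expected_false_alarms_per_day(k, interval_seconds=1)
    print(f"{k} sigma threshold: about {alarms:.1f} false alarms per day")
```

For a three-sigma threshold at one sample per second, this works out to a couple of hundred alerts per day on a single, perfectly healthy metric, before any departure from Gaussian makes matters worse.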

Common Myths About Statistical Anomaly Detection

We commonly hear claims that some technique, such as SPC, won’t work because system metrics are not Gaussian. The assertion is that the only workable approaches are complicated non-parametric methods. This is an oversimplification that comes from confusion about statistics.

Here’s an example. Suppose you capture a few observations of a “mystery time series.” We’ve plotted this in Figure 3-6.
