12.03.2016 Views

Anomaly Detection for Monitoring


Clustering

Not all anomaly detection is based on time series of metrics. Clustering, or cluster analysis, is one way of grouping elements together to try to find the odd ones out. Netflix has written about their anomaly detection methods based on cluster analysis.[2] They apply cluster analysis techniques on server clusters to identify anomalous, misbehaving, or underperforming servers.

K-Means clustering is a common algorithm that’s fairly simple to implement. Here’s an example:
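As an illustration, the following is a minimal sketch of K-Means (Lloyd's algorithm) in plain Python, applied to a small fleet of servers. The server names, the metric values, and the rule "treat the smallest cluster as suspicious" are assumptions made for this sketch, not a description of Netflix's system. The deterministic farthest-point initialization is also a simplification chosen for reproducibility; random or k-means++ seeding is more common.

```python
from collections import Counter


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, max_iterations=100):
    """Lloyd's algorithm. Returns (centroids, labels)."""
    # Farthest-point initialization: start from the first point, then
    # repeatedly add the point farthest from every centroid chosen so far.
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(squared_distance(p, c) for c in centroids))
        )

    labels = None
    for _ in range(max_iterations):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda i: squared_distance(p, centroids[i]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the run has converged
        labels = new_labels
        # Update step: each centroid moves to the mean of its members.
        for i in range(k):
            members = [p for p, label in zip(points, labels) if label == i]
            if members:
                centroids[i] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return centroids, labels


# Hypothetical per-server metrics, already scaled to [0, 1]:
# (CPU utilization, normalized request latency).
servers = {
    "web-01": (0.31, 0.22),
    "web-02": (0.29, 0.25),
    "web-03": (0.33, 0.20),
    "web-04": (0.30, 0.24),
    "web-05": (0.28, 0.21),
    "web-06": (0.92, 0.88),
}

names = list(servers)
points = [servers[name] for name in names]
centroids, labels = kmeans(points, k=2)

# Assumption for this sketch: the smallest cluster holds the odd ones out.
cluster_sizes = Counter(labels)
smallest = min(cluster_sizes, key=cluster_sizes.get)
outliers = [name for name, label in zip(names, labels) if label == smallest]
print("Suspicious servers:", outliers)
```

K-Means compares raw distances, so metrics measured in different units should be scaled to a common range first, as the sample data here already is.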

Non-Parametric Analysis

Not all anomaly detection techniques need models to draw useful conclusions about metrics. Some avoid models altogether! These are called non-parametric anomaly detection methods, and use theory from a larger field called non-parametric statistics.

The Kolmogorov-Smirnov test is one non-parametric method that has gained popularity in the monitoring community. It tests for a difference between the distributions of two samples. An example of the type of question it can answer is, “Is the distribution of CPU usage this week significantly different from last week?” Your time intervals don’t necessarily have to be as long as a week, of course.
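The week-over-week question above can be sketched in a few lines of plain Python. The two-sample Kolmogorov-Smirnov statistic is the largest vertical gap between the two empirical cumulative distribution functions. This sketch compares it against the usual large-sample approximation of the critical value rather than computing an exact p-value. The sample data and the function names are hypothetical and exist only for illustration.

```python
import math
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Two-sample KS statistic: the largest gap between the empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    # bisect_right(a, x) / len(a) is the fraction of sample a that is <= x.
    return max(
        abs(bisect_right(a, x) / len(a) - bisect_right(b, x) / len(b))
        for x in a + b
    )


def ks_critical_value(n, m, alpha=0.05):
    """Large-sample approximation of the rejection threshold at level alpha."""
    return math.sqrt(-0.5 * math.log(alpha / 2)) * math.sqrt((n + m) / (n * m))


# Hypothetical CPU usage samples (percent); this week runs about 8 points hotter.
last_week = [20 + (i * 7) % 15 for i in range(50)]
this_week = [28 + (i * 7) % 15 for i in range(50)]

d = ks_statistic(last_week, this_week)
threshold = ks_critical_value(len(last_week), len(this_week))
distribution_changed = d > threshold
print(f"D = {d:.2f}, threshold = {threshold:.2f}, changed = {distribution_changed}")
```

Nothing in this procedure assumes a Gaussian or any other particular shape for the data, which is what makes the test non-parametric. In practice, a library routine such as SciPy's `scipy.stats.ks_2samp` is likely a better choice, since it also reports a p-value.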

We once learned an interesting lesson while trying to solve a sticky problem with a non-Gaussian distribution of values. We wanted to

[2] “Tracking down the Villains: Outlier Detection at Netflix”

56 | Chapter 6: The Broader Landscape
