13.07.2015 Views

View - Statistics - University of Washington


orthogonal distance to the curve), while the V_Along term measures the variance in arclength distances between projection points on the curve. Minimizing ∑ V* (where the sum is over all clusters) will lead to clusters with points regularly spaced along the curve and tightly grouped around it. Large values of α will cause the algorithm to avoid clusters with gaps, while small values will favor thinner clusters. Clustering stops when merging clusters would lead to an increase in ∑ V*.

We extend the method to open principal curves by changing V_Along so that the sum goes only to (N − 1) instead of to N. This is because the closed curves could wrap around, whereas the open curve stops at its end points.

Overview of HPCC:

1. Make a first estimate of the noise points and remove them.
2. Form an initial clustering with at least seven points in each cluster.
3. Fit a principal curve to each cluster.
4. Calculate ∑ V* for each possible merge.
5. Perform the merge which leads to the lowest ∑ V*.
6. Keep merging until the desired number of clusters is reached.

Deciding when to stop clustering is more difficult for open curves than for closed curves. In the closed curve case, clustering stops when any merge would lead to an increase in ∑ V* (Banfield and Raftery, 1993). For open curves, this method leads to an overfitting problem in which we end up with too many clusters. V* can be made arbitrarily close to zero by increasing the number of clusters. We overcame this problem by using approximate Bayes factors (Section 2.2.4).
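The merge criterion and the greedy merging steps above can be illustrated with a minimal Python sketch. Several things here are assumptions rather than definitions taken from the text: the combined form V* = V_About + α · V_Along, V_About as the sum of squared orthogonal distances, and V_Along as the sum of squared deviations of successive arclength gaps from their mean. The names `v_star`, `greedy_merge`, and `cost` are hypothetical. Principal curve fitting and the Bayes factor stopping rule are not implemented; the `cost` callable stands in for "fit a curve to the cluster and return its V*".

```python
from itertools import combinations
from typing import Callable, List, Optional, Sequence


def v_star(dists: Sequence[float], arclens: Sequence[float], alpha: float,
           closed: bool = False, total_length: Optional[float] = None) -> float:
    """Assumed criterion: V* = V_about + alpha * V_along.

    dists   : orthogonal distance of each point to the fitted curve
    arclens : arclength position of each point's projection on the curve
    """
    v_about = sum(d * d for d in dists)

    s = sorted(arclens)
    # Open curve: only the N - 1 gaps between consecutive projections.
    gaps = [b - a for a, b in zip(s, s[1:])]
    if closed:
        # Closed curve: add the wrap-around gap, giving N gaps in total.
        if total_length is None:
            raise ValueError("total_length is required for a closed curve")
        gaps.append(total_length - s[-1] + s[0])

    if not gaps:
        return v_about
    mean_gap = sum(gaps) / len(gaps)
    v_along = sum((g - mean_gap) ** 2 for g in gaps)
    return v_about + alpha * v_along


def greedy_merge(clusters: Sequence[Sequence[float]],
                 cost: Callable[[List[float]], float],
                 n_target: int) -> List[List[float]]:
    """Repeatedly perform the merge giving the lowest total cost (steps 4-6)."""
    current = [list(c) for c in clusters]
    while len(current) > n_target:
        best = None
        for i, j in combinations(range(len(current)), 2):
            merged = current[i] + current[j]
            rest = [c for k, c in enumerate(current) if k not in (i, j)]
            total = cost(merged) + sum(cost(c) for c in rest)
            if best is None or total < best[0]:
                best = (total, i, j)
        _, i, j = best
        merged = current[i] + current[j]
        current = [c for k, c in enumerate(current) if k not in (i, j)] + [merged]
    return current
```

With evenly spaced projections and zero orthogonal distances, `v_star` returns zero under these assumptions, which matches the description of clusters that are regularly spaced along the curve and tightly grouped around it. Passing `closed=True` adds the wrap-around gap, which is the only difference between the closed and open cases in this sketch.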
