11.07.2015 Views

A Tutorial on Support Vector Machines for Pattern Recognition


Figure 11. Gaussian RBF SVMs of sufficiently small width can classify an arbitrarily large number of training points correctly, and thus have infinite VC dimension.

Now we are left with a striking conundrum. Even though their VC dimension is infinite (if the data is allowed to take all values in $\mathbf{R}^{d_L}$), SVM RBFs can have excellent performance (Schölkopf et al., 1997). A similar story holds for polynomial SVMs. How come?

7. The Generalization Performance of SVMs

In this Section we collect various arguments and bounds relating to the generalization performance of SVMs. We start by presenting a family of SVM-like classifiers for which structural risk minimization can be rigorously implemented, and which will give us some insight as to why maximizing the margin is so important.

7.1. VC Dimension of Gap Tolerant Classifiers

Consider a family of classifiers (i.e. a set of functions $\Phi$ on $\mathbf{R}^d$) which we will call "gap tolerant classifiers." A particular classifier $\phi \in \Phi$ is specified by the location and diameter of a ball in $\mathbf{R}^d$, and by two hyperplanes, with parallel normals, also in $\mathbf{R}^d$. Call the set of points lying between, but not on, the hyperplanes the "margin set." The decision functions $\phi$ are defined as follows: points that lie inside the ball, but not in the margin set, are assigned class $\{\pm 1\}$, depending on which side of the margin set they fall. All other points are simply defined to be "correct", that is, they are not assigned a class by the classifier, and do not contribute to any risk. The situation is summarized, for $d = 2$, in Figure 12.
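The decision rule just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not code from the tutorial: the function names, the parameterization of the two parallel hyperplanes as `w . x = b_minus` and `w . x = b_plus` (with `b_minus < b_plus`), the use of `0` to mean "not assigned a class", and the choice to count the ball's boundary as inside the ball are all hypothetical.

```python
import math

def gap_tolerant_classify(x, center, diameter, w, b_minus, b_plus):
    """Return +1 or -1 for a classified point, or 0 if the point is unassigned.

    Assumed parameterization (not from the source): the two parallel
    hyperplanes are w . x = b_minus and w . x = b_plus, with b_minus < b_plus.
    """
    # Points outside the ball are not assigned a class.
    # Assumption: the boundary of the ball counts as inside.
    if math.dist(x, center) > diameter / 2:
        return 0
    s = sum(wi * xi for wi, xi in zip(w, x))
    # The margin set is the open slab strictly between the hyperplanes;
    # points on a hyperplane are not in the margin set.
    if b_minus < s < b_plus:
        return 0
    # Inside the ball and outside the margin set: the class depends on the side.
    return 1 if s >= b_plus else -1

def margin_width(w, b_minus, b_plus):
    """Perpendicular distance between the two hyperplanes."""
    return (b_plus - b_minus) / math.sqrt(sum(wi * wi for wi in w))

# Example in two dimensions: a ball of diameter 2 centered at the origin,
# with vertical hyperplanes at x1 = -0.5 and x1 = 0.5.
center, diameter = (0.0, 0.0), 2.0
w, b_minus, b_plus = (1.0, 0.0), -0.5, 0.5

print(margin_width(w, b_minus, b_plus))                                        # 1.0
print(gap_tolerant_classify((0.8, 0.0), center, diameter, w, b_minus, b_plus))  # 1
print(gap_tolerant_classify((-0.8, 0.0), center, diameter, w, b_minus, b_plus)) # -1
print(gap_tolerant_classify((0.1, 0.0), center, diameter, w, b_minus, b_plus))  # 0 (margin set)
print(gap_tolerant_classify((3.0, 0.0), center, diameter, w, b_minus, b_plus))  # 0 (outside ball)
```

Returning a third value for unassigned points mirrors the text's convention that such points are treated as "correct" and add nothing to the risk; a caller computing an error rate would simply skip them.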
This rather odd family of classifiers, together with a condition we will impose on how they are trained, will result in systems very similar to SVMs, and for which structural risk minimization can be demonstrated. A rigorous discussion is given in the Appendix.

Label the diameter of the ball $D$ and the perpendicular distance between the two hyperplanes $M$. The VC dimension is defined as before to be the maximum number of points that can be shattered by the family, but by "shattered" we mean that the points can occur as errors in all possible ways (see the Appendix for further discussion). Clearly we can control the VC dimension of a family of these classifiers by controlling the minimum margin $M$ and maximum diameter $D$ that members of the family are allowed to assume. For example, consider the family of gap tolerant classifiers in $\mathbf{R}^2$ with diameter $D = 2$, shown in Figure 12. Those with margin satisfying $M \le 3/2$ can shatter three points if $3/2$
