12.07.2015 Views

Ivancevic_Applied-Diff-Geom


Applied Differential Geometry: A Modern Introduction

case the MDL value becomes approximately equal to one half of the BIC value.

Probably the most desirable property of MDL over other selection methods is that its complexity measure takes into account the effects of both dimensions of model complexity, the number of parameters and functional form. The MDL complexity measure, unlike CV, shows explicitly how both factors contribute to model complexity. In short, MDL is a sharper and more accurate method than these three competitors.

The price that is paid for MDL’s superior performance is its computational cost. MDL can be laborious to calculate. First, the Fisher information matrix (3.160) must be obtained by calculating the second derivatives (i.e., Hessian matrix) of the log–likelihood function, $\ln f(y|w)$. This calculation can be non–trivial, though not impossible. Second, the square–root of the determinant of the Fisher information matrix must be integrated over the parameter space. This generally requires use of a numerical integration method such as Markov Chain Monte Carlo (see e.g., [Gilks et. al. (1996)]).

3.11.4.5 Riemannian Geometry of Minimum Description Length

From a geometric perspective, a parametric model family of probability distributions (3.158) forms a Riemannian manifold embedded in the space of all probability distributions (see [Amari (1985); Amari and Nagaoka (2000); McCullagh (1987)]). Every distribution is a point in this space, and the collection of points created by varying the parameters of the model induces a manifold in which ‘similar’ distributions are mapped to ‘nearby’ points. The infinitesimal distance between points separated by the infinitesimal parameter differences $dw^i$ is given by

$$ds^2 = g_{ij}(w)\, dw^i\, dw^j,$$

where $g_{ij}(w)$ is the Riemannian metric tensor.
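
As a rough illustration of the two computational steps described above (second derivatives of the log–likelihood, then integration of the square root of the Fisher information), and of the line element $ds^2$, the following minimal sketch uses a one-parameter Bernoulli model. The model choice, the function names `log_lik`, `fisher_information` and `integrate_sqrt_info`, the finite-difference step, and the use of a simple midpoint rule in place of Markov Chain Monte Carlo are all assumptions made for this example and are not taken from the text. It also assumes that the metric $g$ is the Fisher information, so that $ds = \sqrt{I(w)}\, dw$ in one dimension. For this model the analytic values are $I(w) = 1/(w(1-w))$ and $\int_0^1 \sqrt{I(w)}\, dw = \pi$, which the numerical estimates should approach.

```python
import math


def log_lik(y, w):
    """ln f(y|w) for a single Bernoulli observation y in {0, 1}, with 0 < w < 1."""
    return y * math.log(w) + (1 - y) * math.log(1.0 - w)


def fisher_information(w):
    """I(w) = -E[d^2 ln f(y|w) / dw^2], using a central finite difference."""
    # Step scaled to w so that w - h and w + h stay strictly inside (0, 1).
    h = 1e-3 * min(w, 1.0 - w)
    info = 0.0
    for y, prob in ((0, 1.0 - w), (1, w)):  # exact expectation over y
        second = (log_lik(y, w + h) - 2.0 * log_lik(y, w) + log_lik(y, w - h)) / h**2
        info -= prob * second
    return info


def integrate_sqrt_info(a, b, n_cells):
    """Midpoint-rule estimate of the integral of sqrt(I(w)) dw over [a, b]."""
    dw = (b - a) / n_cells
    total = sum(math.sqrt(fisher_information(a + (i + 0.5) * dw)) for i in range(n_cells))
    return total * dw


# Integral over the whole parameter space (0, 1). The integrand is singular
# (but integrable) at both endpoints, so the midpoint rule converges slowly.
volume = integrate_sqrt_info(0.0, 1.0, 20000)

# Length of the path between w = 0.2 and w = 0.8 measured with ds = sqrt(I(w)) dw,
# compared with the closed form 2 * (asin(sqrt(w2)) - asin(sqrt(w1))).
distance = integrate_sqrt_info(0.2, 0.8, 2000)
closed_form = 2.0 * (math.asin(math.sqrt(0.8)) - math.asin(math.sqrt(0.2)))

print(volume, math.pi)
print(distance, closed_form)
```

With more than a few parameters a grid-based rule like this becomes impractical, which is presumably why the text points to Monte Carlo methods; the sketch is only meant to make the quantities concrete in the simplest case.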
The Fisher information, $I_{ij}(w)$, defined by (3.160), is the natural metric on a manifold of distributions in the context of statistical inference [Amari (1985)]. We argue that the MDL measure of model fitness has an attractive interpretation in such a geometric context.

The first term in MDL equation (3.161) estimates the accuracy of the model since the likelihood $f(y|w^*)$ measures the ability of the model to fit the observed data. The second and third terms are supposed to penalize model complexity; we will show that they have interesting geometric inter-
