

318 A non-asymptotic walk

28.4 Beyond Talagrand’s inequality

Talagrand’s new result for empirical processes stimulated intense research, part of which was aimed at deriving alternatives to Talagrand’s original approach. The interested reader will find in Boucheron et al. (2013) an account of the transportation method and of the so-called entropy method that we developed in a series of papers (Boucheron et al., 2000, 2003, 2005; Massart, 2000) in the footsteps of Michel Ledoux (1996). In particular, using the entropy method established by Olivier Bousquet (2002), we derived a version of Talagrand’s inequality for empirical processes with optimal numerical constants in the exponential bound.

Model selection issues are still posing interesting challenges for empirical process theory. In particular, the implementation of non-asymptotic penalization methods requires data-driven penalty choice strategies. One possibility is to use the concept of “minimal penalty” that Lucien Birgé and I introduced in Birgé and Massart (2007) in the context of Gaussian model selection and, more generally, the “slope heuristics” (Arlot and Massart, 2009), which basically relies on the idea that the empirical loss

$$\gamma_n(s) - \gamma_n(\hat{s}_m) = \sup_{t \in S_m} \{\gamma_n(s) - \gamma_n(t)\}$$

has a typical behavior for large dimensional models. A complete theoretical validation of these heuristics is yet to be developed but partial results are available; see, e.g., Arlot and Massart (2009), Birgé and Massart (2007), and Saumard (2013).

A fairly general concentration inequality providing a non-asymptotic analogue to Wilks’ Theorem is also established in Boucheron and Massart (2011) and used in Arlot and Massart (2009). This result stems from the entropy method, which is flexible enough to capture the following rather subtle self-localization effect. The variance of $\sup_{t \in S_m} \{\gamma_n(s) - \gamma_n(t)\}$ can be proved to be of the order of the variance of $\gamma_n(s) - \gamma_n(t)$ at $t = \hat{s}_m$, which may be much smaller than the maximal variance. This is typically the quantity that would emerge from a direct application of Talagrand’s inequality for empirical processes.

The issue of calibrating model selection criteria from data is of great importance. In the context where the list of models itself is data dependent (think, e.g., of models generated by variables selected from an algorithm such as LARS), the problem is related to the equally important issue of choosing regularization parameters; see Meynet (2012) for more details. This is a new field of investigation which is interesting both from a theoretical and a practical point of view.
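The text does not spell out a data-driven calibration procedure, so the following is only a minimal sketch of one common reading of the slope heuristics, in a toy Gaussian sequence setting chosen for illustration. It assumes that the empirical risk of the largest models decreases roughly linearly in the model dimension, takes minus the fitted slope as an estimate of the minimal penalty constant, and then penalizes with twice that constant. The function names, the `fit_fraction` parameter, the factor 2, and the simulated signal are all assumptions made for this example, not details taken from the text.

```python
import numpy as np


def slope_heuristic_select(emp_risk, dims, fit_fraction=0.5):
    """Pick a model dimension with a slope-heuristics style rule (sketch).

    emp_risk[i] is the minimal empirical risk over the model of dimension
    dims[i]. A line is fitted to the empirical risk of the largest models;
    minus its slope estimates the minimal penalty constant, and the final
    penalty is taken as twice the minimal one (an assumption of this sketch).
    """
    emp_risk = np.asarray(emp_risk, dtype=float)
    dims = np.asarray(dims, dtype=float)

    # Keep only the largest models, where the linear behavior is assumed.
    cutoff = np.quantile(dims, 1.0 - fit_fraction)
    large = dims >= cutoff

    # np.polyfit returns coefficients from highest degree down: slope first.
    slope, _intercept = np.polyfit(dims[large], emp_risk[large], deg=1)
    kappa_min = -slope

    # Penalized criterion with penalty 2 * kappa_min * dimension.
    penalized = emp_risk + 2.0 * kappa_min * dims
    selected_dim = int(dims[np.argmin(penalized)])
    return selected_dim, kappa_min


def simulate_empirical_risks(n_coords=2000, max_dim=1000, sigma=1.0, seed=0):
    """Toy Gaussian sequence model with nested models (hypothetical setup).

    y_j = beta_j + sigma * noise_j, and the model of dimension D keeps the
    first D coordinates, so its least-squares empirical risk is the sum of
    y_j**2 over j > D.
    """
    rng = np.random.default_rng(seed)
    beta = np.zeros(n_coords)
    beta[:10] = 5.0  # only the first 10 coordinates carry signal
    y = beta + sigma * rng.standard_normal(n_coords)

    dims = np.arange(1, max_dim + 1)
    emp_risk = np.sum(y ** 2) - np.cumsum(y ** 2)[:max_dim]
    return dims, emp_risk


if __name__ == "__main__":
    dims, emp_risk = simulate_empirical_risks()
    selected_dim, kappa_min = slope_heuristic_select(emp_risk, dims)
    print("estimated minimal penalty constant:", kappa_min)
    print("selected dimension:", selected_dim)
```

In this toy setting the fitted constant should land near the noise variance, since each added pure-noise coordinate lowers the empirical risk by about `sigma**2` on average. How many of the largest models to use for the fit is a tuning choice that the sketch leaves to `fit_fraction`.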
