…which could work well in the presence of bias (Bickel, 1981), although I was far from putting this work in its proper context. Their work in this area, earlier work on detecting sparse objects (Donoho et al., 1992), and earlier work of Stone (1977) made it apparent that, without much knowledge of a statistical problem, minimax bounds indicated that nothing much could be achieved in even moderate dimensions. (For an s-smooth regression function of d variables, for instance, the minimax rate of convergence is n^{-s/(2s+d)}, which becomes uselessly slow as d grows.)

On the other hand, a branch of computer science, machine learning, had developed methodology such as neural nets, and on the statistics side, Leo Breiman and Jerry Friedman, working with Richard Olshen and Charles Stone, developed CART. Both of these methods of classification use very high dimensional predictors and, relative to the number of predictors, small training sets. These methods worked remarkably well, far better than the minimax theory would lead us to believe. These approaches, and a plethora of other methods developed in the two communities, such as Boosting, Random Forests, and above all "lasso"-driven methods, involve, implicitly or explicitly, "regularization," which pulls solutions of high dimensional optimization problems towards low dimensional spaces.

In many situations, while we know little about the problem, if we can assume that, in an appropriate representation, only a relatively few major factors matter, then theorists can hope to reconcile the "Curse of Dimensionality" minimax results with the observed success of prediction methods based on very high dimensional predictor sets.

Under the influence of Leo Breiman I became very aware of these developments and started to contribute, for instance, to the theory of boosting in Bickel et al. (2006). I particularly liked a simple observation with Bo Li, growing out of my Rietz lecture (Bickel and Li, 2007). If predictors of dimension p are assumed to lie on an unknown smooth d-dimensional manifold of R^p with d ≪ p, then the difficulty of the nonparametric regression problem is governed not by p but by d, provided that regularization is done in a suitably data-determined way; that is, bandwidth selection is done after implicit or explicit estimation of d. (A toy sketch of this two-step procedure is given below.)

6.4.3 Estimating high dimensional objects

My views on the necessary existence of low dimensional structure were greatly strengthened by working with Elizaveta Levina on her thesis. We worked with Jitendra Malik, a specialist in computer vision, and his students, first in analyzing an algorithm for texture reconstruction developed by his then student, Alexei Efros, and then in developing some algorithms for texture classification. The first problem turned out to be equivalent to a type of spatial bootstrap. The second could be viewed as a classification problem based on samples of 1000+ dimensional vectors (picture patches), where the goal was to classify the picture from which the patches were taken into one of several classes.
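The two-step recipe mentioned above, estimate the intrinsic dimension d explicitly and then let the estimate rather than p drive the bandwidth, can be sketched in a few lines of code. What follows is a minimal toy illustration, not the estimator of Bickel and Li (2007): it uses the k-nearest-neighbor maximum likelihood estimator of intrinsic dimension proposed by Levina and Bickel, plugs the estimate into the bandwidth rate n^{-1/(d+4)} appropriate for a twice-differentiable regression function, and smooths with a Gaussian-kernel Nadaraya-Watson estimator. The helix data set and the choice k = 10 are assumptions made purely for illustration.

import numpy as np

def intrinsic_dim_mle(X, k=10):
    """k-NN maximum likelihood estimate of intrinsic dimension (Levina-Bickel)."""
    # Pairwise Euclidean distances; O(n^2 p) time/memory, fine for a toy example.
    D = np.sqrt(((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1))
    D.sort(axis=1)               # row i holds distances from x_i in ascending order
    T = D[:, 1:k + 1]            # T_1, ..., T_k; column 0 was the self-distance 0
    # Local estimate at x_i: (k-1) / sum_{j<k} log(T_k / T_j); then average over i.
    m_hat = (k - 1) / np.log(T[:, -1:] / T[:, :-1]).sum(axis=1)
    return m_hat.mean()

def nadaraya_watson(X, y, x0, h):
    """Gaussian-kernel Nadaraya-Watson regression estimate at the point x0."""
    w = np.exp(-((X - x0) ** 2).sum(axis=1) / (2.0 * h ** 2))
    return (w @ y) / w.sum()

# Toy data: a one-dimensional curve (a helix) embedded in R^3, so d = 1, p = 3.
rng = np.random.default_rng(0)
t = np.sort(rng.uniform(0.0, 4.0 * np.pi, 400))
X = np.column_stack([np.cos(t), np.sin(t), 0.1 * t])
y = np.sin(t) + 0.1 * rng.normal(size=t.size)

d_hat = intrinsic_dim_mle(X, k=10)        # close to 1 (the manifold), not 3
h = X.shape[0] ** (-1.0 / (d_hat + 4.0))  # bandwidth driven by d_hat, not by p
print(f"estimated intrinsic dimension: {d_hat:.2f}, bandwidth: {h:.3f}")
print(f"fit at X[0]: {nadaraya_watson(X, y, X[0], h):.3f} (observed {y[0]:.3f})")

Because the helix is intrinsically one-dimensional, the estimated dimension lands near 1 and the resulting bandwidth shrinks at the one-dimensional rate n^{-1/5}; treating the same data as genuinely three-dimensional would force a wider bandwidth and a slower rate, which is exactly the gap the Bickel-Li observation explains.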
