13.07.2015 Views

Model Selection Based on the Modulus of Continuity

To examine the risk prediction of the model selection methods, we plotted the estimated error versus the number of nodes for the AIC, BIC, and SEB methods in the L2 sense of the loss function, that is, the square root of the risk function, and for the MC method in the L1 sense of the loss function defined by (16). The predicted results were compared with the test errors, as shown in Figures 11 and 12. These results showed that 1) the risk predictions of the AIC and BIC methods fit well except at sudden changes in the risk function, 2) the risk prediction of the SEB method tended to fit well when the ratio of the number of nodes to the number of samples, n/l, was small, and 3) the risk prediction of the MC method agreed well with the test errors over the whole range of the number of nodes. Notably, the MC method was able to catch the sudden change in the test errors, while the other methods did not.

In summary, the MC-based method performed better than the other methods for various types of target functions from the viewpoints of risk and node ratios. We also demonstrated that the risk prediction of the MC method was able to catch the trend of the test errors. This was mainly because the MC-based method uses risk function bounds that incorporate information from the learned results, such as the sum of absolute weights, as well as structural information about the estimation network. Furthermore, the suggested MC method can be easily extended to various types of regression models with nonlinear kernel functions that satisfy some smoothness constraints.
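
The comparison of predicted and test errors described above can be illustrated with a minimal sketch. The exact AIC and BIC expressions used in the paper do not appear in this excerpt, so the code assumes common penalized-risk forms (empirical risk plus a complexity penalty scaled by a noise variance estimate). The function names, the `noise_var` parameter, and the toy empirical risks are hypothetical and serve only as an illustration. The square root is taken so that the predictions are comparable with test errors in the L2 sense.

```python
import math

def aic_risk(emp_risk, n_nodes, n_samples, noise_var):
    # Assumed AIC-type estimate: empirical risk + 2 * (n / l) * noise variance.
    return emp_risk + 2.0 * n_nodes / n_samples * noise_var

def bic_risk(emp_risk, n_nodes, n_samples, noise_var):
    # Assumed BIC-type estimate: empirical risk + ln(l) * (n / l) * noise variance.
    return emp_risk + math.log(n_samples) * n_nodes / n_samples * noise_var

def predicted_l2_errors(emp_risks, n_samples, noise_var, criterion):
    # emp_risks maps a number of nodes to the empirical risk (mean squared error).
    # Returns the square root of the estimated risk for each number of nodes.
    return {
        n: math.sqrt(criterion(r, n, n_samples, noise_var))
        for n, r in emp_risks.items()
    }

def select_nodes(predicted):
    # Choose the number of nodes with the smallest predicted error.
    return min(predicted, key=predicted.get)

# Toy inputs (illustrative only, not taken from the paper).
toy_emp_risks = {2: 0.50, 4: 0.20, 8: 0.10, 16: 0.09}
aic_pred = predicted_l2_errors(toy_emp_risks, n_samples=50, noise_var=0.1, criterion=aic_risk)
bic_pred = predicted_l2_errors(toy_emp_risks, n_samples=50, noise_var=0.1, criterion=bic_risk)
best_aic = select_nodes(aic_pred)
best_bic = select_nodes(bic_pred)
```

Plotting `aic_pred` and `bic_pred` against the number of nodes, alongside measured test errors, would give curves of the kind compared in Figures 11 and 12. The MC bound itself is not sketched here because its definition lies outside this excerpt.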
