Model Selection Based on the Modulus of Continuity
similar complexity while the fourth function was more complex than the other functions. For these target functions, we generated N (= 50) training samples randomly according to the target function to train the estimation network. We also generated 300 test samples separately.

For the estimation network, the basis functions of the trigonometric polynomial network were given by

$$\phi_0(x) = \frac{1}{2}, \qquad \phi_{2j-1}(x) = \sin jx, \qquad \text{and} \qquad \phi_{2j}(x) = \cos jx,$$

where $j$ sets the angular frequency of the sinusoidal function (its period is $2\pi/j$). Here, the estimation function $f_n(x)$ was given by

$$f_n(x) = \sum_{k=0}^{n} w_k \phi_k(x). \tag{31}$$

For N samples, the observation vector defined by $\mathbf{y} = (y_1, \cdots, y_N)^T$ can be approximated by the following vector form:

$$\mathbf{y} = \Phi_n \mathbf{w} \tag{32}$$

where $\Phi_n$ was a matrix in which the $ij$-th element was given by $\phi_j(x_i)$ and $\mathbf{w}$ was a weight vector defined by $\mathbf{w} = (w_0, \cdots, w_n)^T$. From the empirical risk minimization of the square loss function, the estimated weight vector $\hat{\mathbf{w}}$ could be determined by

$$\hat{\mathbf{w}} = (\Phi_n^T \Phi_n)^{-1} \Phi_n^T \mathbf{y}. \tag{33}$$

By substituting the estimated weight vector into (32), we obtained the empirical risk $R_{emp}(f_n)$ evaluated on the training samples, and the estimated risk $\hat{R}(f_n)$ could be determined by the AIC, BIC, SEB, and MC based methods. Here, the estimated optimal number of nodes was determined by

$$\hat{n} = \arg\min_{n} \hat{R}(f_n). \tag{34}$$

Note that in the MC method, only the terms of $R_{emp}(f_n)$ and the modulus of continuity for the estimation function were considered to select the optimal number of nodes, since the other terms in (19) were constant once the training samples were given. To compare the risk