Model Selection Based on the Modulus of Continuity

similar complexity while the fourth function was more complex than other functions. For these target functions, we generated N (= 50) training samples randomly according to the target function to train the estimation network. We also generated 300 test samples separately.

For the estimation network, the basis functions of the trigonometric polynomial network were given by

$$\phi_0(x) = \frac{1}{2}, \quad \phi_{2j-1}(x) = \sin jx, \quad \text{and} \quad \phi_{2j}(x) = \cos jx,$$

where $j$ represents the frequency of the sinusoidal function. Here, the estimation function $f_n(x)$ was given by

$$f_n(x) = \sum_{k=0}^{n} w_k \phi_k(x). \tag{31}$$

For N samples, the observation vector defined by $\mathbf{y} = (y_1, \cdots, y_N)^T$ can be approximated by the following vector form:

$$\mathbf{y} = \Phi_n \mathbf{w} \tag{32}$$

where $\Phi_n$ was a matrix in which the $ij$-th element was given by $\phi_j(x_i)$ and $\mathbf{w}$ was a weight vector defined by $\mathbf{w} = (w_0, \cdots, w_n)^T$. From the empirical risk minimization of the square loss function, the estimated weight vector $\hat{\mathbf{w}}$ could be determined by

$$\hat{\mathbf{w}} = (\Phi_n^T \Phi_n)^{-1} \Phi_n^T \mathbf{y}. \tag{33}$$

By substituting the estimated weight vector into (32), we obtained the empirical risk $R_{emp}(f_n)$ evaluated on the training samples, and the estimated risk $\hat{R}(f_n)$ could be determined by the AIC, BIC, SEB, and MC based methods. Here, the estimated optimal number of nodes was determined by

$$\hat{n} = \arg\min_n \hat{R}(f_n). \tag{34}$$

Note that in the MC method, only the terms of $R_{emp}(f_n)$ and the modulus of continuity for the estimation function were considered to select the optimal number of nodes, since the other terms in (19) were constant once the training samples were given.

To compare the risk obtained for the estimated optimal number of nodes $\hat{n}$ with the minimum risk over the number of nodes, obtained from the test samples, we computed the log ratio of the two risks, that is,

$$r_R = \log \frac{R(f_{\hat{n}})}{\min_n R(f_n)} \tag{35}$$

where $R(f_n)$ represented the risk function for the squared error loss function $L(y, f_n) = (y - f_n(x))^2$. This risk ratio measured how far the estimated optimal risk was from the optimal risk. We also computed the log ratio of the estimated optimal number of nodes $\hat{n}$ to the number of nodes that minimized the risk on the test patterns, that is,

$$r_n = \log \frac{\hat{n}}{\arg\min_n R(f_n)}. \tag{36}$$

This node ratio measured how far the estimated optimal complexity of the network was from the optimal complexity. After the experiments were repeated 1000 times, the risk ratios of (35) and the node ratios of (36) were plotted using the box-plot method. The simulation results of model selection using the AIC, BIC, SEB, and MC based methods are illustrated in Figures 2 through 9. These simulation results showed that 1) the SEB based method outperformed the AIC and BIC based methods in terms of risk ratios for the first, second, and third target functions but not for the fourth, more complicated function, and 2) the MC method demonstrated top-level performance in terms of both risk and node ratios for all four target functions.

In general, the SEB method showed good performance when the ratio of the optimal number of nodes to the number of samples, $n^*/l$, was small, since the true risk bounds based on the VC dimension were derived in the sense of uniform convergence, that is, the worst case in the hypothesis space. As an example, we illustrated $n^*/l$ for the four target functions in Figure 10. As expected, the fourth target function required a high $n^*/l$ compared to the other target functions due to the complexity of the function, as shown in Figure 1. In this case, the SEB method did not show good performance.
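The procedure in (31) through (36) can be sketched in a few lines of NumPy. This is only an illustrative sketch under stated assumptions, not the authors' code: the target function, the noise level, the input range, the search range 1..n_max, and the helper names (`design_matrix`, `fit_weights`, `risk`, `log_ratios`) are all hypothetical. Because the MC criterion of (19) is not reproduced in this section, the estimated risk is replaced here by the standard Gaussian form of the AIC, N log R_emp + 2(n + 1), as a stand-in. The weights are obtained with `np.linalg.lstsq`, which solves the same least-squares problem as (33) without forming the explicit inverse.

```python
import numpy as np

def design_matrix(x, n):
    """Matrix Phi_n with columns phi_0..phi_n evaluated at the points x."""
    cols = [np.full(x.shape, 0.5)]            # phi_0(x) = 1/2
    for k in range(1, n + 1):
        j = (k + 1) // 2                      # frequency index of phi_k
        cols.append(np.sin(j * x) if k % 2 == 1 else np.cos(j * x))
    return np.column_stack(cols)

def fit_weights(Phi, y):
    """Least-squares weights, the same minimizer as (33)."""
    w, *_ = np.linalg.lstsq(Phi, y, rcond=None)
    return w

def risk(x, y, w):
    """Mean squared error of f_n on the samples (x, y)."""
    n = len(w) - 1
    return float(np.mean((y - design_matrix(x, n) @ w) ** 2))

def log_ratios(x_tr, y_tr, x_te, y_te, n_max):
    """Return (r_R, r_n, n_hat, n_star) following (34)-(36)."""
    ns = np.arange(1, n_max + 1)              # assumed search range; n >= 1 keeps (36) finite
    N = len(x_tr)
    emp, test = [], []
    for n in ns:
        w = fit_weights(design_matrix(x_tr, n), y_tr)
        emp.append(risk(x_tr, y_tr, w))       # empirical risk on training samples
        test.append(risk(x_te, y_te, w))      # risk estimated from test samples
    emp, test = np.array(emp), np.array(test)
    est = N * np.log(emp) + 2 * (ns + 1)      # AIC stand-in for the estimated risk (assumption)
    i_hat, i_star = int(np.argmin(est)), int(np.argmin(test))
    r_R = float(np.log(test[i_hat] / test[i_star]))   # (35)
    r_n = float(np.log(ns[i_hat] / ns[i_star]))       # (36)
    return r_R, r_n, int(ns[i_hat]), int(ns[i_star])

# Hypothetical target function and noise level, chosen only for illustration.
rng = np.random.default_rng(0)
target = lambda x: np.sin(x) + 0.5 * np.cos(3 * x)
x_tr = rng.uniform(-np.pi, np.pi, 50)         # N = 50 training samples
y_tr = target(x_tr) + rng.normal(0.0, 0.2, 50)
x_te = rng.uniform(-np.pi, np.pi, 300)        # 300 test samples
y_te = target(x_te) + rng.normal(0.0, 0.2, 300)

r_R, r_n, n_hat, n_star = log_ratios(x_tr, y_tr, x_te, y_te, n_max=20)
print(f"n_hat={n_hat}, n_star={n_star}, r_R={r_R:.4f}, r_n={r_n:.4f}")
```

By construction r_R is never negative, and it equals zero exactly when the selected model attains the minimum test risk. Repeating the last block over many random draws and collecting r_R and r_n would give the kind of samples that the box plots summarize.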