CRANFIELD UNIVERSITY Eleni Anthippi Chatzimichali ...
3. Validation Techniques
In the case of bootstrapping, a bootstrap training set D_bootTrain is created by randomly
picking samples with replacement from the training dataset D_train. The total size of
D_bootTrain is equal to the size of D_train. Since bootstrapping is based on sampling with
replacement, any given sample may appear multiple times within the same
bootstrap training set. The samples of D_train that are not drawn into the bootstrap
training set make up the bootstrap test set D_bootTest. Similarly, for k-fold
cross-validation, the initial dataset D is partitioned into k mutually exclusive folds;
k = 10 (10-fold cross-validation) was employed according to Section 1.6.2. In each
iteration, a single fold is used to form the test set D_kfoldTest, while the remaining
samples constitute the training set D_kfoldTrain. In the limiting case of LOOCV,
D_loocvTest consists of a single sample, while the remaining samples form D_loocvTrain.
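The three resampling schemes described above can be sketched as index-generating functions. This is a minimal illustration using only the Python standard library, not the implementation used in this work; the function names and the use of a seeded random.Random generator are assumptions made for the example.

```python
import random

def bootstrap_split(n_samples, rng):
    """Return (train_idx, test_idx) for one bootstrap replicate."""
    # Draw n_samples indices with replacement, so duplicates are allowed
    # and the training set has the same size as the original set.
    train_idx = [rng.randrange(n_samples) for _ in range(n_samples)]
    drawn = set(train_idx)
    # Samples that were never drawn form the bootstrap test set.
    test_idx = [i for i in range(n_samples) if i not in drawn]
    return train_idx, test_idx

def kfold_splits(n_samples, k, rng):
    """Yield (train_idx, test_idx) for k mutually exclusive folds."""
    indices = list(range(n_samples))
    rng.shuffle(indices)
    # Fold j takes every k-th shuffled index, so fold sizes differ by at most one.
    folds = [indices[j::k] for j in range(k)]
    for j in range(k):
        test_idx = folds[j]
        train_idx = [i for m, fold in enumerate(folds) if m != j for i in fold]
        yield train_idx, test_idx

def loocv_splits(n_samples):
    """Yield (train_idx, test_idx) where each sample is the test set exactly once."""
    for i in range(n_samples):
        yield [j for j in range(n_samples) if j != i], [i]
```

Because each draw misses a given sample with probability 1 - 1/n, the bootstrap test set is expected to hold roughly 36.8% of the samples for large n, whereas 10-fold cross-validation always holds out about 10% per iteration.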
4. Hyperparameter optimisation
According to Section 1.5.2.3, nonlinear SVMs are usually considered a reasonable
first choice. In the case of RBF models with bootstrapping, the SVMs are built and
optimised using D_bootTrain and D_bootTest for different hyperparameter settings. More
specifically, for each given combination of the hyperparameters C and γ, a new SVM
model is trained with D_bootTrain and tested with D_bootTest.
The most intuitive and fairly naïve approach for parameter selection involves an
exhaustive grid-search over an extensive range of hyperparameters. However, this is
an extremely time-consuming and computationally intensive procedure, even when
ample processing power is available. Therefore, in this work, the parameter search
was implemented based on the approach suggested by Hsu et al. (2003), also
described in Meyer et al. (2003), in a two-step approach using a combination of a
coarse and a fine grid-search. Initially, the values of C and γ increase exponentially
with ranges equal to [ ] and [ ] respectively.
The combination of hyperparameters that gives the highest overall classification
accuracy is recorded as optimal. Once an optimal region is located on the grid, a finer
grid-search is conducted in the “neighbourhood” of good parameters.
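The two-step search can be sketched as follows. This is a hedged illustration rather than the code used in this work: the base of 2, the step sizes, the exponent ranges in the toy example and all function names are assumptions, and score(C, gamma) is a hypothetical callable that, in the setting described here, would train an RBF SVM on D_bootTrain and return its classification accuracy on D_bootTest. A toy score function stands in for it so that the sketch runs on its own.

```python
import itertools
import math

def frange(start, stop, step):
    """Inclusive list of floats from start to stop in increments of step."""
    n = int(round((stop - start) / step))
    return [start + i * step for i in range(n + 1)]

def grid_search(score, c_exponents, gamma_exponents, base=2.0):
    """Score every (C, gamma) pair on an exponential grid; return the best."""
    best = None
    for c_exp, g_exp in itertools.product(c_exponents, gamma_exponents):
        accuracy = score(base ** c_exp, base ** g_exp)
        # Keep the first pair that reaches the highest accuracy.
        if best is None or accuracy > best[0]:
            best = (accuracy, c_exp, g_exp)
    return best

def coarse_to_fine(score, c_range, gamma_range, coarse_step=2.0, fine_step=0.25):
    """Coarse grid over the full exponent ranges, then a fine grid near the optimum."""
    # Step 1: coarse pass locates a promising region of the grid.
    _, c0, g0 = grid_search(
        score, frange(*c_range, coarse_step), frange(*gamma_range, coarse_step)
    )
    # Step 2: fine pass restricted to the neighbourhood of the coarse optimum.
    return grid_search(
        score,
        frange(c0 - coarse_step, c0 + coarse_step, fine_step),
        frange(g0 - coarse_step, g0 + coarse_step, fine_step),
    )

def toy_score(C, gamma):
    """Stand-in for SVM accuracy (assumption): peaks at C = 2**3.25, gamma = 2**-1.5."""
    return -((math.log2(C) - 3.25) ** 2 + (math.log2(gamma) + 1.5) ** 2)

# Illustrative exponent ranges only; they are not the ranges used in this work.
best_accuracy, best_c_exp, best_gamma_exp = coarse_to_fine(toy_score, (-4, 8), (-6, 2))
```

With these illustrative settings the coarse pass evaluates 35 combinations and the fine pass 289, compared with 1617 evaluations for a single grid at the fine resolution over the same ranges.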