
As can be seen in Fig. 8, the results have once again improved noticeably. The maximum precision is 98.75% over a wide range of Q values starting from 0.76. It is always the same gait sequence of the same subject that is misclassified.

Fig. 8. Naïve Bayes classification results for discretized spaces

6 Feature Selection

A separate challenge is the automatic feature selection task: the choice of the most relevant features from the point of view of the given classification problem.

An exhaustive search over all possible feature subsets is unworkable because of its computational complexity; the problem is NP-complete. Even for the simple case of 50 features, each of which may be included or excluded, there are 2^50 − 1 ≈ 1.1259 × 10^15 non-empty subsets to test. That is why approximate methods have to be used.
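To make that scale concrete, a back-of-the-envelope check (the 1 µs per-subset cost below is an arbitrary assumption for illustration only):

```python
# Illustration: the number of non-empty feature subsets grows as 2^n - 1,
# which rules out exhaustive search even for modest n.
n_features = 50
n_subsets = 2 ** n_features - 1            # 1,125,899,906,842,623 ~ 1.13e15

# Hypothetical assumption: evaluating one subset costs 1 microsecond.
seconds = n_subsets / 1e6
years = seconds / (365.25 * 24 * 3600)
print(f"{n_subsets:.4e} subsets, ~{years:.0f} years at 1 us per subset")
```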

There are two main approaches in feature selection: attribute rankings and search strategies based on the evaluation of whole subsets [13]. In the ranking approach each attribute is evaluated separately by some kind of quality index, and the highest-scored ones are selected. The number of attributes to select can be specified in advance or determined from the scores obtained. The crucial problem is the way attributes are evaluated. The most well-known rankers are based either on computed statistics, for instance the Gini index, the chi-square test or the Fisher ratio, or on entropy measures, for instance information gain and gain ratio.
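A minimal ranking sketch (an assumed implementation, not the one used in this chapter): each discrete attribute is scored by information gain with respect to the class, and the k highest-scoring indices are returned.

```python
import numpy as np

def entropy(labels):
    # H(Y) = -sum_c p_c * log2(p_c) over the class frequencies
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

def information_gain(x, y):
    # IG(Y; X) = H(Y) - sum_v P(X=v) * H(Y | X=v), for a discrete attribute x
    conditional = sum((x == v).mean() * entropy(y[x == v]) for v in np.unique(x))
    return entropy(y) - conditional

def rank_attributes(X, y, k):
    # score every column of X independently; return the k best indices
    scores = np.array([information_gain(X[:, j], y) for j in range(X.shape[1])])
    order = np.argsort(scores)[::-1]
    return order[:k], scores[order[:k]]
```

Rankers of this kind are cheap, one pass per attribute, but as the next paragraph argues, they are blind to interactions between attributes.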

However, the assumption that only simple relations exist between single attributes and class values is very naive. In many cases the worth of an attribute can be noticed only when it is considered together with others, as the example below shows. That is the reason why methods that evaluate whole feature subsets are gaining more attention. They are more reliable and usually give a more compact representation with greater predictive abilities.
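The XOR problem is the classic illustration of such an interaction (an assumed example, not from the chapter; sklearn's `mutual_info_score` measures the dependence between two discrete label vectors, in nats):

```python
import numpy as np
from sklearn.metrics import mutual_info_score

x1 = np.array([0, 0, 1, 1])
x2 = np.array([0, 1, 0, 1])
y = x1 ^ x2                               # class = XOR of the two attributes

print(mutual_info_score(y, x1))           # 0.0: a ranker would discard x1
print(mutual_info_score(y, x2))           # 0.0: ... and x2 as well
print(mutual_info_score(y, 2 * x1 + x2))  # ~0.693 nats (1 bit): jointly perfect
```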

In the approach with whole-subset evaluation there are two crucial issues: the search method, which specifies how the feature space is explored, and the way subsets are evaluated.

In most cases greedy hill climbing and genetic algorithms are used as search methods. In the hill climbing exploration [13], we start with an empty subset and in each subsequent step add the attribute whose inclusion in the currently determined subset improves the evaluation the most.
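A minimal sketch of this forward selection (one plausible reading of [13], not the authors' implementation), assuming a wrapper-style evaluator: the cross-validated accuracy of a Naïve Bayes classifier, echoing the classifier used earlier in the chapter.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB

def evaluate(X, y, subset):
    # wrapper evaluation: mean CV accuracy on the candidate feature subset
    return cross_val_score(GaussianNB(), X[:, subset], y, cv=5).mean()

def forward_selection(X, y):
    selected, best_score = [], -np.inf
    remaining = list(range(X.shape[1]))
    while remaining:
        # try extending the current subset by each remaining attribute
        score, j = max((evaluate(X, y, selected + [j]), j) for j in remaining)
        if score <= best_score:            # no improvement: stop at a local optimum
            break
        selected.append(j)
        remaining.remove(j)
        best_score = score
    return selected, best_score
```

The genetic alternative explores the same space with populations of candidate subsets instead of a single greedy path.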
