Thesis - Instituto de Telecomunicações

procedures that will be detailed next.

Parallel Feature Selection. Given that the feature search for each user is independent, a parallel algorithm was constructed and easily deployed on several machines sharing some data files (we used a total of 4 computers with similar computing power).

Recursive Increasing Precision Feature Selection. To obtain an estimate of the classifier error, a set of runs must be performed, corresponding to the application of the classifier to a set of test samples. The number of wrapper tests directly influences both the time spent in classification and the precision of the reported classification error; both grow linearly with the number of performed tests. In each step of the Sequential Forward Search (SFS) feature selection method, the goal is to find the best feature to add to the feature vector. With a reduced test set, a feature subset can be immediately eliminated without further inspection, leaving a reduced set of features for further study with longer test sets. This procedure is run 3 times with an increasing number of test samples, obtaining suitable precision for the classifier error on a reduced set of features.

The concerns expressed in this subsection focused on time complexity, which was the limitation found during the implementation of the algorithms. The space complexity is linear in the number of users and features.

5.3 Sequential Classification

The behavioral data used in this thesis, in particular the EDA data, presents significant class overlap.
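The recursive increasing-precision SFS step described in the previous subsection can be sketched as follows. This is a minimal illustration: the scoring function, the test-set sizes, and the pruning fraction are hypothetical placeholders, not the thesis's actual wrapper evaluation.

```python
import random

def classifier_error(features, n_tests, rng):
    # Stand-in for a wrapper evaluation: estimates the classifier
    # error over n_tests samples. Hypothetical model in which the
    # precision of the estimate grows with the number of tests.
    base = 0.5 / (1 + len(features))
    noise = rng.uniform(-1, 1) / n_tests ** 0.5
    return max(0.0, base + noise)

def sfs_step(selected, candidates, test_sizes=(10, 50, 250), keep=0.5, seed=0):
    """Pick the best feature to add to `selected`.

    Each round uses a larger test set; candidates whose estimated
    error falls in the worst fraction are rejected before the next,
    more expensive round, so only a reduced set of features is
    studied with the longer test sets.
    """
    rng = random.Random(seed)
    pool = list(candidates)
    for n_tests in test_sizes:
        scores = {f: classifier_error(selected + [f], n_tests, rng) for f in pool}
        pool.sort(key=lambda f: scores[f])          # lower error first
        if n_tests != test_sizes[-1]:
            pool = pool[: max(1, int(len(pool) * keep))]  # prune worst half
    return pool[0]  # best surviving feature at full precision

best = sfs_step([], ["f1", "f2", "f3", "f4", "f5", "f6"])
print(best)
```

The three entries of `test_sizes` mirror the three passes with an increasing number of test samples; each pass halves the candidate pool, so the expensive high-precision runs are only spent on the most promising features.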
This discriminating difficulty is intrinsic to the data, despite the undertaken modeling and feature selection efforts.

In order to overcome the problem of high classification error probability due to class overlap, we propose a classifier that uses several samples from the same source in a sequential classification decision rule.

Consider as a starting point the MAP classifier defined by [76]:

decide w_i if i = argmax_i ( p(w_i | x) )    (5.6)
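The MAP rule in (5.6) can be illustrated with a short sketch. The priors and the Gaussian class-conditional likelihoods below are hypothetical placeholders standing in for whatever model produces p(w_i | x):

```python
import numpy as np

# MAP decision rule of Eq. (5.6): decide w_i for the class index i
# that maximizes the posterior p(w_i | x).
priors = np.array([0.5, 0.3, 0.2])                # p(w_i), assumed

def likelihood(x, means=np.array([0.0, 1.0, 2.0]), sigma=1.0):
    # p(x | w_i): a unit-variance Gaussian per class (assumed model)
    return np.exp(-0.5 * ((x - means) / sigma) ** 2)

def map_decide(x):
    posteriors = likelihood(x) * priors           # proportional to p(w_i | x)
    return int(np.argmax(posteriors))             # i = argmax_i p(w_i | x)

print(map_decide(0.1))   # a sample near class 0's mean -> 0
```

Because the normalizing term p(x) is common to all classes, the argmax over p(x | w_i) p(w_i) gives the same decision as the argmax over p(w_i | x); with overlapping classes a single sample x still carries a high error probability, which motivates the sequential rule over several samples.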
