25 input attributes to 16 input attributes). Therefore, the selection of what are thought to be significant attributes does not improve classification performance. Furthermore, the wide gap between correctly predicted "High risk" cases (sensitivity) and correct "High risk" predictions (positive predictive value) persists. Again, the nature of the problem, the difficulty of measuring influential parameters, and the resultant poor mapping between input attributes and outcomes are the suspected causes.

7.2.3. Clinical Models CM3b and CM4b

The results of the two models CM3b and CM4b can be seen in Table 7.3. These models share the same input attribute sets as models CM3a and CM4a respectively. However, the expected output sets are expanded using alternative risk categories: "Very High risk"; "High risk"; "Medium risk"; and "Low risk". The hope is that this expansion of categorical risks will improve the classification results.

The evaluation measures here are based on a confusion matrix, with the assumption that the numbers of "Very High risk", "High risk", and "Medium risk" patients are counted as positive outcomes, and the number of "Low risk" patients as negative outcomes. The network topologies and their parameters are the same as in the CM3a and CM4a experiments, except for the increased number of output nodes (4). This is required by the binary representation of the categorical outcomes "Very High risk"; "High risk"; "Medium risk"; and "Low risk". For example, the multilayer perceptron used with models CM3b and CM4b has topologies of 16-2-4 (16 input nodes; 2 hidden nodes; and 4 output nodes) and 14-2-4 (14 input nodes; 2 hidden nodes; and 4 output nodes).

Surprisingly, Table 7.3 shows that all expected "Medium risk" patients are predicted into the "Low risk" class, except for the classifiers CM3b-SVM (only one pattern correctly falls into the "Medium risk" class) and CM4b-MLP (one pattern falls into the "High risk" class). This suggests that the combinations of data models and classifiers support just three levels of risk.

The sensitivity rates and positive predictive values of all classifiers are very poor (an average of 0.06 and 0.22 respectively, excluding the classifiers CM3b-RBF and CM4b-RBF). Furthermore, the gaps between these rates are not reduced. Therefore, the expansion of the outcome risk labelling does not help to clarify or improve the classification results, in particular with regard to "High risk" predictions.
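To make the binary output representation and the 16-2-4 topology concrete, the following is a minimal sketch, not taken from the thesis: it assumes a scikit-learn multilayer perceptron and randomly generated placeholder data in place of the clinical attribute set, so the class names are the only elements carried over from the text.

```python
# Minimal sketch (not the author's original code): binary (one-of-four)
# encoding of the risk categories and a 16-2-4 multilayer perceptron for the
# CM3b attribute set, approximated here with scikit-learn's MLPClassifier.
# The training data below is random placeholder data, not clinical data.
import numpy as np
from sklearn.neural_network import MLPClassifier

RISK_CLASSES = ["Very High risk", "High risk", "Medium risk", "Low risk"]

def encode_risk(label: str) -> np.ndarray:
    """Binary representation of one categorical risk outcome (4 output nodes)."""
    target = np.zeros(len(RISK_CLASSES))
    target[RISK_CLASSES.index(label)] = 1.0
    return target

print(encode_risk("Medium risk"))  # [0. 0. 1. 0.]

# Placeholder sample: 100 patients, 16 input attributes (as in CM3b).
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 16))
y = rng.choice(RISK_CLASSES, size=100)

# 16 input nodes and a single hidden layer of 2 nodes; the 4 output nodes
# follow from the four risk classes, giving the 16-2-4 topology.
mlp = MLPClassifier(hidden_layer_sizes=(2,), max_iter=2000, random_state=0)
mlp.fit(X, y)
print(mlp.predict(X[:5]))
```

The CM4b variant would differ only in the input dimension (14 attributes, giving a 14-2-4 topology).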
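Similarly, the confusion-matrix-based measures can be illustrated with a small hypothetical helper that groups "Very High risk", "High risk", and "Medium risk" as positive outcomes and "Low risk" as negative, following the definition above. The labels in the usage example are made up and are not the values behind Table 7.3.

```python
# Minimal sketch (assumed helper, not from the thesis): sensitivity and
# positive predictive value under the binarised outcome definition used in
# this section.
POSITIVE = {"Very High risk", "High risk", "Medium risk"}

def sensitivity_and_ppv(expected, predicted):
    """Return (sensitivity, positive predictive value) for one classifier."""
    tp = fp = fn = tn = 0
    for e, p in zip(expected, predicted):
        e_pos, p_pos = e in POSITIVE, p in POSITIVE
        if e_pos and p_pos:
            tp += 1          # correctly predicted at-risk patient
        elif not e_pos and p_pos:
            fp += 1          # "Low risk" patient flagged as at risk
        elif e_pos and not p_pos:
            fn += 1          # at-risk patient missed
        else:
            tn += 1
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    ppv = tp / (tp + fp) if (tp + fp) else 0.0
    return sensitivity, ppv

# Made-up example labels (not Table 7.3 data):
expected = ["High risk", "Low risk", "Medium risk", "Low risk"]
predicted = ["Low risk", "Low risk", "High risk", "High risk"]
print(sensitivity_and_ppv(expected, predicted))  # (0.5, 0.5)
```

The gap discussed in the text is simply the difference between these two numbers for each classifier.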
