12.07.2015 Views

Predicting Cardiovascular Risks using Pattern Recognition and Data ...


Case Study V in Chapter 8 showed that mutual information calculations performed comparably to the popular Relief algorithm (see the comparison results in Figures 8.2 and 8.3). The slight improvements observed, plus the simpler calculation, led to mutual information being chosen for use with the thesis data.

9.3. Contributions

From the academic point of view, the main contributions of this thesis can be identified as follows:

The implementation of modeling the cardiovascular data based on clinical knowledge.
Investigating and implementing the use of pattern recognition and data mining techniques instead of other methods such as POSSUM and PPOSSUM.
The investigation and verification of a data mining methodology for evaluating individual risk prediction in alternative risk prediction models by using alternative pattern recognition and data mining techniques.
The improvement of the K-means algorithm, as KMIX, to use alternative attribute types in the data domain.
The definition of a calculation based on mutual information and Bayes' theorem, and its use as attribute weights in the WKMIX algorithm.

Data from both clinical sites (Hull and Dundee) is viewed using 6 clinical models based on clinical expert advice. Three other scoring risk models were built based on the Hull site data. The clinical model outcomes for individual patients are labeled via heuristic formulas (see section 6.3 in Chapter 6), whereas the scoring risk outcomes are based on the model threshold values (see Table 6.6 in Chapter 6). Model CM2 was derived from model CM1 with a different outcome set (based on the "PATIENT STATUS" and "30D stroke/death" attributes). The CM3a and CM4a outcomes are derived from the CM2 outcomes with smaller input sets.
The other models, CM3b and CM4b, differ from CM3a and CM4a in an expansion of the scale for the outcomes. These alternative outcomes were used in the hope of obtaining more detailed risk predictions for individual patients. However, all of the above models appear to fail, as indicated by the poor performance of the supervised classifiers and the large gaps between the sensitivity rates and the positive predictive values (for "High risk" predictions) in all thesis experiments. The suggested reason for these failures is the nature of the problem and the difficulty of measuring influential parameters for the models.

POSSUM and PPOSSUM can predict individual risk for patients via the mortality, morbidity, and death rate scores. The numeric output moving from 0 (0%) to 1 (100%) supported the patient risks
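
To make the gap between sensitivity and positive predictive value for "High risk" predictions concrete, the following is a minimal sketch of how the two rates can be computed from actual and predicted labels. The function name `sensitivity_and_ppv`, the "Low risk" label, and the ten example patients are illustrative assumptions. They are not taken from the thesis data or its experiments.

```python
def sensitivity_and_ppv(actual, predicted, positive="High risk"):
    """Return (sensitivity, positive predictive value) for the positive class."""
    pairs = list(zip(actual, predicted))
    # True positives: positive cases that were predicted positive.
    tp = sum(1 for a, p in pairs if a == positive and p == positive)
    # False negatives: positive cases that were missed.
    fn = sum(1 for a, p in pairs if a == positive and p != positive)
    # False positives: negative cases that were flagged as positive.
    fp = sum(1 for a, p in pairs if a != positive and p == positive)

    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    ppv = tp / (tp + fp) if (tp + fp) else float("nan")
    return sensitivity, ppv


# Made-up labels for ten hypothetical patients (not thesis data).
actual = ["High risk"] * 4 + ["Low risk"] * 6
predicted = (
    ["High risk"] * 3 + ["Low risk"]      # 3 of 4 high-risk patients found
    + ["High risk"] * 5 + ["Low risk"]    # 5 of 6 low-risk patients flagged
)

sens, ppv = sensitivity_and_ppv(actual, predicted)
print(f"sensitivity = {sens:.3f}, PPV = {ppv:.3f}")
```

In this made-up example the classifier finds most of the true high-risk patients, so sensitivity is high, but most of its "High risk" predictions are wrong, so the positive predictive value is low. A large gap of this kind is the pattern the text describes as a sign of model failure.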
