12.07.2015 Views

Predicting Cardiovascular Risks using Pattern Recognition and Data ...



The results are then divided into groups across the range 0%-100%. For each group, a mean predicted risk is calculated and used to compute the number of predicted deaths (“mortality” or “death rate”; see Tables C2 and C3).

Range of         Mean predicted risk   No of        Predicted   Reported   Ratio
predicted rate   of mortality (%)      operations   deaths      deaths
0-10%            3.00%                 438          13          60         4.62
10-20%           13.48%                39           5           9          1.80
20-30%           23.25%                13           3           3          1.00
30-40%           32.27%                5            2           4          2.00
40-50%           44.86%                3            1           2          2.00
>50%             58.37%                1            1           -          0.00
0-100%           5%                    498          25          78         3.12

Table C3: Comparison of observed and predicted deaths from the PPOSSUM logistic equations.

Step 4 (Comparison/Evaluation): The comparison is based on the ratio of reported to predicted deaths in each band. For example, for the 20%-30% band in Table C2, the predicted mortality is calculated from the mean predicted risk (24.97%) and the number of operation cases (44), which gives 11 predicted deaths. The ratio between the reported mortality (8) and the predicted one (11) is therefore 0.73.

C.2. Case study II

Clinical Model CM3aD

Step 1 (Selection): The data are taken from the Dundee site, with a selection of 18 attributes (16 input attributes and 2 attributes used for the outcome calculations) and 341 patients.

Step 2 (Clean/Transform/Filter): The method used here follows the “Data Preparation Strategy” section in Chapter 5. The details are as follows:

Cleaning task: A summary of this task is given in Table C4 below. For example, the attribute “OP DURATION” has 72 missing values. Its values lie in the range [0.7, 3], so the missing values are filled with the mean of the non-missing values (1.50). The attribute “PATCH” has 10 missing values, and its most frequent value, “PTFE”, occurs 170 times. Therefore, its missing values are filled with “PTFE”.

Transformation task: All data must be transformed to numerical values. Continuous values are therefore rescaled to the range [0,1] using a normalisation method. Boolean values are transformed to 0 or 1 respectively. Categorical values are
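The cleaning and transformation tasks described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure actually used for model CM3aD: the text does not name the normalisation method, so min-max rescaling is assumed here, and the function names, the sample values and the placeholder category “OTHER” are hypothetical.

```python
from collections import Counter
from statistics import mean


def fill_with_mean(values):
    """Replace missing entries (None) with the mean of the non-missing values."""
    present = [v for v in values if v is not None]
    fill = mean(present)
    return [fill if v is None else v for v in values]


def fill_with_mode(values):
    """Replace missing entries (None) with the most frequent non-missing value."""
    present = [v for v in values if v is not None]
    fill = Counter(present).most_common(1)[0][0]
    return [fill if v is None else v for v in values]


def min_max_scale(values):
    """Rescale continuous values to [0, 1] (min-max normalisation, an assumption)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant attribute carries no information; map every value to 0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def encode_boolean(values):
    """Map Boolean values to 1 (true) or 0 (false)."""
    return [1 if v else 0 for v in values]


if __name__ == "__main__":
    # Hypothetical sample records, not taken from the Dundee data set.
    op_duration = fill_with_mean([0.7, None, 3.0])
    patch = fill_with_mode(["PTFE", None, "PTFE", "OTHER"])
    print(min_max_scale(op_duration))
    print(patch)
    print(encode_boolean([True, False]))
```

Imputing before rescaling keeps the filled values inside the observed range, so they also fall within [0,1] after normalisation.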
