12.07.2015 Views

Predicting Cardiovascular Risks using Pattern Recognition and Data ...

(16 input nodes; 0 hidden nodes; 3 output nodes for the three levels of risk) for CM3bDC, learning rate = 0.3, and number of cycles = 100 epochs; the radial basis function network uses c = 2 centres; and the support vector machine is used with a polynomial kernel function and exponent parameter p = 2. 10-fold cross-validation is used. The results can be seen in Tables C12 and C13.

C.5. Chapter 7 Experiments

C.5.1. Clinical Models CM1 and CM2

Step 1 (Selection): The data structure here includes 26 attributes (24 inputs and 2 attributes for outcome heuristic calculations) and 839 patient records. They are the common attributes derived from the Hull (498 cases) and the Dundee (341 cases) data sites. The data structure can be seen in Table C14 below.

Step 2 (Clean/Transform/Filter):

Cleaning task: Missing values are filled by the same method as above: the mean for continuous attributes and the mode for categorical or Boolean attributes. However, a specialized heuristic transformation is used for the missing values in the attribute “PATCH”. There were 253 missing values across both sites: 243 in the Hull site and 10 in the Dundee site. Therefore, missing values in the Hull site are replaced by the mode of the PATCH attribute in that site (“Dacron”), whereas in the Dundee site they are replaced by “PTFE”.

Transformation task: The continuous values in the data set are also transformed into the range [0,1] with the linear transformation method indicated in Chapter 5 (section 5.3.3 - “Data Preparation Strategy”).

Filtering task: The final data set contains 25 instead of 26 input attributes after removing the “empty” one (the COMP_GROUP attribute has 650/839 missing values).
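The cleaning and transformation tasks above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the thesis's implementation: the function names, the toy records, and the use of `None` for a missing value are hypothetical, and the linear transformation is assumed to be ordinary min-max scaling, since section 5.3.3 is not reproduced here.

```python
from collections import Counter
from statistics import mean


def impute_column(values, kind):
    """Fill None entries with the mean (continuous) or the mode (otherwise)."""
    observed = [v for v in values if v is not None]
    if kind == "continuous":
        fill = mean(observed)
    else:
        # Ties are broken by first occurrence; the source does not say how.
        fill = Counter(observed).most_common(1)[0][0]
    return [fill if v is None else v for v in values]


def impute_by_site(values, sites):
    """Fill None entries with the mode computed within each site separately."""
    modes = {}
    for site in set(sites):
        observed = [v for v, s in zip(values, sites)
                    if s == site and v is not None]
        modes[site] = Counter(observed).most_common(1)[0][0]
    return [modes[s] if v is None else v for v, s in zip(values, sites)]


def min_max_scale(values):
    """Assumed linear transformation: map values onto the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant column: nothing to spread
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical toy records, not taken from the Hull or Dundee data.
sites = ["Hull", "Hull", "Hull", "Dundee", "Dundee", "Dundee"]
patch = ["Dacron", None, "Dacron", "PTFE", "PTFE", None]
age = [60, None, 80, 70, 50, 90]

patch_filled = impute_by_site(patch, sites)
age_filled = impute_column(age, "continuous")
age_scaled = min_max_scale(age_filled)
```

Computing the mode per site rather than over the pooled data matters here because almost all of the missing PATCH values come from one site, so a pooled mode would impose that site's dominant value on the other.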

The outcome for model CM1 is created by the heuristic formula based on the “PATIENT_STATUS” attribute as follows:

IF PATIENT_STATUS = “Dead” THEN “High risk”
Otherwise, “Low risk”

Hence, CM1 has 126 values of “High risk” and 713 values of “Low risk” respectively.

[Table C14 (truncated in this excerpt). Columns: Attribute name | Attribute type | Missing values | Attribute values | Max Freq/Mean]
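The CM1 outcome rule stated above is a one-line mapping from PATIENT_STATUS to a risk label. A minimal sketch, assuming the status is held as a plain string; the function name `cm1_outcome` and the status value "Alive" are hypothetical, since the source names only "Dead":

```python
from collections import Counter


def cm1_outcome(patient_status):
    """Return the CM1 risk label: "High risk" if dead, otherwise "Low risk"."""
    return "High risk" if patient_status == "Dead" else "Low risk"


# Hypothetical statuses for illustration; "Alive" is an assumed value.
statuses = ["Dead", "Alive", "Alive", "Dead", "Alive"]
labels = [cm1_outcome(s) for s in statuses]
label_counts = Counter(labels)
```

Applied to the full data set, counting the labels this way should give the class distribution the text reports for CM1.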
