12.07.2015 Views

Predicting Cardiovascular Risks using Pattern Recognition and Data ...


Filtering task: The expected outcomes are calculated from two attributes, “PATIENT STATUS” and “COMBINE”, using the following heuristic rules. Note that the expected outcomes are used only for comparison with the clustering results.

(PATIENT STATUS, COMBINE) = 0: “Low risk”
(PATIENT STATUS, COMBINE) = 1: “Medium risk”
(PATIENT STATUS, COMBINE) = 2: “High risk”

Hence, the CM3bD model contains 48 “High risk”, 73 “Medium risk”, and 220 “Low risk” patterns.

Step 3 (Data Mining Techniques): The SOM Toolbox clustering tool for the Matlab software package (SOM toolbox, 2000-2005) is used. The data is stored in a 341 x 16 matrix (341 rows and 16 columns). A map of size [30, 16] is created. Here 30 is the length of the map (munits) and 16 (the number of attributes) is its dimension (dim). The value of 30 comes from a heuristic formula for “munits” (Alhoniemi et al, 2005):

(5 * 341 (rows) ^ 0.54321) / 4 ≈ 30

Figure C1: The U-matrix and each component plane for model CM3bD.

The data is trained with the Best Matching Unit algorithm (BMU; SOM toolbox, 2000-2005). This means that the distances between each input vector and all map units (map nodes) are calculated, and the unit with the greatest similarity (minimum Euclidean distance) to the input vector is chosen. This node is called the winner node. The map weights are updated, and the self-organizing map algorithm continues until all the input ...
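The map-size heuristic and the winner-node search described above can be illustrated with a short Python sketch. It is not the SOM Toolbox code used in the study: the function names, the toy codebook, and the learning rate alpha are hypothetical, rounding the heuristic up to an integer is an assumption, and the update moves only the winner node (a full SOM also updates the winner's neighbours).

```python
import math


def small_map_units(n_rows):
    """Heuristic map size from the text: 5 * n_rows^0.54321 / 4.

    Rounding up to an integer is an assumption made for this sketch.
    """
    return math.ceil(5 * n_rows ** 0.54321 / 4)


def find_bmu(codebook, x):
    """Return the index of the map unit closest to x (minimum Euclidean distance)."""
    distances = [math.dist(weights, x) for weights in codebook]
    return min(range(len(codebook)), key=distances.__getitem__)


def update_winner(codebook, x, bmu, alpha=0.5):
    """Move the winner's weight vector a fraction alpha toward x.

    Simplification: neighbouring units are left unchanged.
    """
    codebook[bmu] = [w + alpha * (xi - w) for w, xi in zip(codebook[bmu], x)]


if __name__ == "__main__":
    print(small_map_units(341))

    # Toy example: three map units with two attributes each.
    codebook = [[0.0, 0.0], [1.0, 1.0], [0.0, 1.0]]
    x = [0.9, 0.8]
    bmu = find_bmu(codebook, x)
    update_winner(codebook, x, bmu)
    print(bmu, codebook[bmu])
```

For the 341 patterns of model CM3bD the heuristic evaluates to about 29.7, which is consistent with the map length of 30 stated in the text.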
