12.07.2015 Views

Predicting Cardiovascular Risks using Pattern Recognition and Data ...

Note that “Expected High” means the expected “High risk” class, and “Cluster High” means the “High risk” cluster. The same applies to “Expected Low” and “Cluster Low”.

From Table 7.13, the distances from the “Low risk” cluster to the “High risk” and “Low risk” classes (expectations) are negligible (nearly 0.00). In contrast, the distances from the “High risk” cluster to both of these classes are large (2.01). This explains the poor clustering performance for “High risk”: in their original distributions, the patterns in the expected “High risk” class are not close to each other.

Table 7.14 shows the negligible distance (0.01) between the correct “High risk” group and the incorrect “Low risk” group (in the “High risk” cluster). This means these patterns are very closely distributed in data space. In other words, they have similar pattern forms. The same explanation might be used for the gap between the correct “Low risk” group and the incorrect “High risk” group (with a distance of 1.01).

Groups                                                             Distances
Correct High – Incorrect High (True Positive – False Negative)     3.02
Correct High – Correct Low (True Positive – True Negative)         2.03
Correct High – Incorrect Low (True Positive – False Positive)      0.01
Correct Low – Incorrect High                                       1.01
Correct Low – Incorrect Low (True Negative – False Positive)       2.02
Incorrect Low – Incorrect High (False Positive – False Negative)   3.01

Table 7.14: The distances between alternative groups in confusion matrix.

In contrast, the distances between the other pairs of clustering groups are at least double the two distances above. For example, the distance from the correct “High risk” patterns to the incorrect “High risk” patterns is 3.02. This means their distributions in data space are quite far from each other. Hence, the correct “High risk” pattern forms are different from the incorrect “High risk” ones.
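The group distances discussed above can be illustrated with a minimal sketch. The text does not state how the distances in Table 7.14 were computed, so this example assumes the Euclidean distance between group centroids. The function names (`centroid`, `confusion_groups`, `group_distances`) and the toy patterns are hypothetical and are not taken from the study.

```python
from itertools import combinations
from math import dist


def centroid(points):
    """Return the coordinate-wise mean of a non-empty list of vectors."""
    return [sum(column) / len(points) for column in zip(*points)]


def confusion_groups(patterns, expected, clustered):
    """Split patterns into four groups named after their expected class.

    A pattern is "Correct" when its cluster matches its expected class,
    e.g. expected "Low" but placed in the "High" cluster -> "Incorrect Low".
    """
    groups = {"Correct High": [], "Incorrect High": [],
              "Correct Low": [], "Incorrect Low": []}
    for pattern, exp, clu in zip(patterns, expected, clustered):
        status = "Correct" if exp == clu else "Incorrect"
        groups[f"{status} {exp}"].append(pattern)
    return groups


def group_distances(groups):
    """Assumed measure: Euclidean distance between centroids of each pair
    of non-empty groups."""
    centres = {name: centroid(pts) for name, pts in groups.items() if pts}
    return {(a, b): dist(centres[a], centres[b])
            for a, b in combinations(centres, 2)}


# Made-up 2-D patterns for illustration only (not data from the study).
patterns = [(0, 0), (2, 0), (1, 1), (4, 4), (6, 4), (5, 3)]
expected = ["High", "High", "Low", "Low", "Low", "High"]
clustered = ["High", "High", "High", "Low", "Low", "Low"]

groups = confusion_groups(patterns, expected, clustered)
distances = group_distances(groups)
for (a, b), d in distances.items():
    print(f"{a} - {b}: {d:.2f}")
```

In this toy data the “Correct High” and “Incorrect Low” centroids lie close together while “Correct High” and “Incorrect High” lie far apart, which mirrors the qualitative pattern described for Table 7.14. A different distance measure (for example, average pairwise distance between group members) would need a different `group_distances`.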
However, the correct and incorrect “High risk” patterns have the same outcomes, as labelled using the heuristic rules indicated in Chapter 6. This result strongly suggests that the natural structure of the data (similar pattern forms) does not support the labelled outcomes from the heuristic clinical models. In other words, the nature of the problem and the
