13.07.2013 Views

I'r - Memorial University of Newfoundland


Selecting variables: It has been demonstrated that, to overcome the problem of dimensionality in multivariate statistics, the ratio of the number of cases to the number of variables should be at least three (Howarth and Sinding-Larsen, 1983). Thus, for a data set consisting of 23 cases, no more than 7 variables can be utilized at a time. The first problem was therefore the selection of 7 variables for each data set that would represent the whole PAH assemblage in further analyses. Although different approaches to variable selection were used, care was always taken to keep compounds with different molecular weights more or less equally represented.
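The limit of seven variables follows from simple arithmetic, sketched below. The helper name `max_variables` and its default argument are illustrative assumptions, not part of the original study. The sketch reads the rule as requiring at least three cases per variable, which is the reading consistent with 23 cases and 7 variables.

```python
def max_variables(n_cases: int, min_cases_per_variable: int = 3) -> int:
    """Largest number of variables that keeps the cases-to-variables
    ratio at or above the given minimum (hypothetical helper)."""
    if n_cases < 0 or min_cases_per_variable <= 0:
        raise ValueError("n_cases must be >= 0 and the minimum ratio must be > 0")
    # Integer division rounds down, so the ratio never falls below the minimum.
    return n_cases // min_cases_per_variable


# 23 cases at three cases per variable: 23 // 3 = 7
print(max_variables(23))
```

With 23 cases, an eighth variable would bring the ratio down to 23/8, which is below three.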

One of the approaches to variable selection is analysis of the correlation matrix for the data set and exclusion of one of each two highly correlated variables (e.g., Vajnovsky and Malinin, 1992). This approach was adopted for the data on molecular composition. The significance of correlations within the molecular data set is represented by Figure 3.2.2 (see also Appendix AS). This figure shows that most of the variables are highly correlated. The seven least correlated variables (Ae, Fl, Fa, BbF, Per, DBA, BP) were chosen for further statistical analyses.
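The idea of excluding one of two highly correlated variables can be sketched as a greedy procedure. This is a minimal illustration under stated assumptions, not the procedure used in the study: the function names, the tie-breaking rule (drop the member of the pair that is more correlated with the remaining variables on average), and the synthetic data are all hypothetical. The study judged correlations by their significance and also balanced molecular weights, which this sketch does not model.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def select_least_correlated(columns, n_keep):
    """Greedy sketch: repeatedly drop one variable of the most strongly
    correlated remaining pair until n_keep variables are left.

    columns: dict mapping variable name -> list of values (one per case).
    """
    if n_keep < 1:
        raise ValueError("n_keep must be at least 1")
    keep = list(columns)
    # Absolute correlation for every ordered pair of distinct variables.
    r = {(p, q): abs(pearson(columns[p], columns[q]))
         for p in keep for q in keep if p != q}

    def mean_r(v):
        # Average absolute correlation of v with the other kept variables.
        return sum(r[(v, w)] for w in keep if w != v) / (len(keep) - 1)

    while len(keep) > n_keep:
        p, q = max(((p, q) for p in keep for q in keep if p != q), key=r.get)
        keep.remove(p if mean_r(p) >= mean_r(q) else q)
    return keep


# Synthetic data, 23 cases: "b" is nearly a copy of "a"; "c" and "d" are independent.
random.seed(0)
a = [random.gauss(0, 1) for _ in range(23)]
data = {
    "a": a,
    "b": [v + 0.01 * random.gauss(0, 1) for v in a],
    "c": [random.gauss(0, 1) for _ in range(23)],
    "d": [random.gauss(0, 1) for _ in range(23)],
}
selected = select_least_correlated(data, n_keep=3)
print(selected)
```

In this synthetic example one of the near-duplicate pair "a"/"b" is dropped, while "c" and "d" are retained.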

A similar analysis of the correlation matrix for the isotopic composition data (Appendix A9) showed much lower significance levels (Fig. 3.2.3). This implied that in this case most variables could not be predicted from each other. However, since determination of δ¹³C was in many cases impeded by a low signal-to-noise ratio or low concentrations, the data set contains numerous cases with missing values (Appendix A1). Therefore, the

