I'r - Memorial University of Newfoundland
Selecting variables: It was demonstrated that, to overcome the problem of
dimensionality in multivariate statistics, the ratio of the number of cases to the number of
variables should be no less than three (Howarth and Sinding-Larsen, 1983). Thus, for a data
set consisting of 23 cases, no more than 7 variables can be utilized at a time. Therefore
the first problem was the selection of 7 variables for each data set which would represent the
whole PAH assemblage in further analyses. Although different approaches to variable
selection were used, a precaution was always taken to keep compounds with different
molecular weights more or less equally represented.
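The limit of 7 variables for 23 cases follows from dividing the number of cases by three and rounding down. A minimal sketch of that arithmetic is given here; the function name `max_variables` and its parameter `min_ratio` are illustrative and do not come from the cited work.

```python
def max_variables(n_cases: int, min_ratio: float = 3.0) -> int:
    """Largest number of variables that keeps n_cases / n_variables >= min_ratio."""
    # Floor division gives the largest whole number of variables allowed.
    return int(n_cases // min_ratio)


# 23 cases, as in the data sets discussed in the text.
limit = max_variables(23)
print(limit)  # 7
```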
One of the approaches to variable selection is analysis of the correlation matrix for
the data set and exclusion of one of each pair of highly correlated variables (e.g., Vajnovsky and
Malinin, 1992). This approach was adopted for the data on molecular composition. The
significance of correlations within the molecular data set is represented by Figure 3.2.2
(see also Appendix AS). This figure shows that most of the variables are highly
correlated. The seven least correlated variables (Ae, Fl, Fa, BbF, Per, DBA, BP) were
chosen for further statistical analyses.
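The text does not give the exact rule used to pick the seven least correlated variables, so the following is only a sketch of one possible procedure. It assumes a greedy scheme: repeatedly find the most strongly correlated pair and drop the member that is, on average, more correlated with the remaining variables. The function name `select_least_correlated`, the synthetic data, and the variable names `x`, `y`, `z` are all hypothetical.

```python
import numpy as np


def select_least_correlated(data, names, n_keep):
    """Greedily drop one variable of the most correlated pair until n_keep remain.

    data  : 2-D array, rows are cases and columns are variables
    names : list of variable names, one per column
    """
    # Absolute Pearson correlations; the diagonal is zeroed so that a
    # variable is never paired with itself.
    corr = np.abs(np.corrcoef(data, rowvar=False))
    np.fill_diagonal(corr, 0.0)

    keep = list(range(len(names)))
    while len(keep) > n_keep:
        sub = corr[np.ix_(keep, keep)]
        # Position of the most strongly correlated pair among kept variables.
        i, j = np.unravel_index(np.argmax(sub), sub.shape)
        # Assumption: drop the member with the higher mean correlation
        # to all other kept variables.
        drop = i if sub[i].mean() >= sub[j].mean() else j
        del keep[drop]
    return [names[k] for k in keep]


# Synthetic illustration only (not the thesis data): y nearly duplicates x.
rng = np.random.default_rng(0)
x = rng.normal(size=50)
y = x + 0.01 * rng.normal(size=50)
z = rng.normal(size=50)
selected = select_least_correlated(np.column_stack([x, y, z]), ["x", "y", "z"], 2)
print(selected)
```

In this toy case one of the two near-duplicate variables is removed and the independent one is retained. With real data, a constraint such as keeping different molecular weight groups represented would have to be checked separately.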
A similar analysis of the correlation matrix for the isotopic composition data (Appendix
A9) showed much lower significance levels (Fig. 3.2.3). This implied that in this case
most variables could not be predicted from each other. However, since determination of
δ13C in many cases was impeded by a low signal-to-noise ratio or low concentrations, the
data set contains numerous cases with missing values (Appendix A1). Therefore, the