14.03.2014 Views

Modeling and Multivariate Methods - SAS

Chapter 21 Fitting Partial Least Squares Models

Overview of the Partial Least Squares Platform

The Partial Least Squares (PLS) platform fits linear models based on linear combinations, called factors, of the explanatory variables (Xs). These factors are obtained in a way that attempts to maximize the covariance between the Xs and the response or responses (Ys). In this way, PLS exploits the correlations between the Xs and the Ys to reveal underlying latent structures.
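To make the covariance criterion concrete, here is a minimal sketch for the simplest case of a single response and a single factor. It relies on a standard textbook result: among unit-length weight vectors w, the covariance between the scores Xw and a centered y is largest when w is proportional to X'y. The function names and the small data set are made up for illustration, and this is not a description of how the platform itself computes factors.

```python
import math

def center(values):
    """Subtract the mean from a list of numbers."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]

def columns_centered(X):
    """Return the columns of X (a list of rows), each mean-centered."""
    return [center([row[j] for row in X]) for j in range(len(X[0]))]

def first_factor_weights(X, y):
    """Unit-length weights w proportional to X'y for centered X and y."""
    cols = columns_centered(X)
    yc = center(y)
    w = [sum(c * r for c, r in zip(col, yc)) for col in cols]
    norm = math.sqrt(sum(v * v for v in w))
    return [v / norm for v in w]

def score_covariance(X, y, w):
    """Sample covariance between the factor scores t = Xw and y."""
    cols = columns_centered(X)
    yc = center(y)
    n = len(y)
    t = [sum(cols[j][i] * w[j] for j in range(len(w))) for i in range(n)]
    return sum(ti * yi for ti, yi in zip(t, yc)) / (n - 1)

# Made-up illustrative data: 5 observations, 2 predictors, 1 response.
X = [[1, 2], [2, 1], [3, 4], [4, 3], [5, 6]]
y = [1, 2, 3, 4, 5]
w = first_factor_weights(X, y)
```

Comparing `score_covariance(X, y, w)` with the value obtained for any other unit-length weight vector shows that no other direction gives a larger covariance with the response.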

The PLS approach to model fitting is particularly useful when there are more explanatory variables than observations or when the explanatory variables are highly correlated. You can use PLS to fit a single model to several responses simultaneously (Wold, 1995; Wold et al., 2001; Eriksson et al., 2006).

Two model fitting algorithms are available: nonlinear iterative partial least squares (NIPALS) and a “statistically inspired modification of PLS” (SIMPLS) (de Jong, 1993; Boulesteix and Strimmer, 2006). The SIMPLS algorithm was developed with the goal of solving a specific optimality problem. For a single response, both methods give the same model. For multiple responses, the two methods give slightly different results.
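As a rough illustration of what factor extraction involves, the following sketch implements a textbook form of NIPALS for a single response: compute weights from the current residuals, form the scores, regress X and y on the scores, and deflate both before extracting the next factor. The function names and the data are hypothetical, and the code is a teaching sketch under those assumptions, not the platform's implementation.

```python
import math

def center_columns(M):
    """Mean-center each column of M (a list of rows)."""
    n = len(M)
    means = [sum(row[j] for row in M) / n for j in range(len(M[0]))]
    return [[row[j] - means[j] for j in range(len(row))] for row in M]

def nipals_pls1(X, y, n_factors):
    """Return fitted y values and the residual sum of squares after each factor."""
    E = center_columns(X)                 # X residuals, deflated per factor
    y_mean = sum(y) / len(y)
    f = [v - y_mean for v in y]           # y residuals, deflated per factor
    n, p = len(E), len(E[0])
    fitted = [0.0] * n
    rss_path = []
    for _ in range(n_factors):
        # Weights: direction of maximum covariance with the current residuals.
        w = [sum(E[i][j] * f[i] for i in range(n)) for j in range(p)]
        norm = math.sqrt(sum(v * v for v in w))
        if norm < 1e-12:                  # nothing left to explain
            break
        w = [v / norm for v in w]
        t = [sum(E[i][j] * w[j] for j in range(p)) for i in range(n)]  # scores
        tt = sum(v * v for v in t)
        load = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(p)]
        q = sum(f[i] * t[i] for i in range(n)) / tt
        # Deflate X and y by the part explained by this factor.
        E = [[E[i][j] - t[i] * load[j] for j in range(p)] for i in range(n)]
        f = [f[i] - q * t[i] for i in range(n)]
        fitted = [fitted[i] + q * t[i] for i in range(n)]
        rss_path.append(sum(v * v for v in f))
    return [y_mean + v for v in fitted], rss_path

# Made-up data in which y = 1 + 2*x1 - x2 exactly.
X = [[1, 2], [2, 1], [3, 4], [4, 3], [5, 6], [6, 4]]
y = [1 + 2 * a - b for a, b in X]
fitted, rss_path = nipals_pls1(X, y, n_factors=2)
```

Because the made-up response is an exact linear function of the two predictors, extracting two factors reproduces it up to rounding error, and the residual sum of squares never increases as factors are added.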

The platform uses the van der Voet T² test and cross validation to help you choose the optimal number of factors to extract.

• In JMP, the platform uses the leave-one-out method of cross validation.

• In JMP Pro, you can choose KFold or random holdback cross validation, or you can specify a validation column. If you prefer, you can also turn off validation. Note that leave-one-out cross validation can be obtained by setting the number of folds in KFold equal to the number of rows.
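The relationship between KFold and leave-one-out cross validation noted above can be sketched in a few lines. The helper below is hypothetical and assigns rows to folds in contiguous blocks for simplicity, whereas actual software typically assigns rows to folds at random. The row count of 16 is chosen only for illustration.

```python
def kfold_indices(n_rows, n_folds):
    """Split row indices 0..n_rows-1 into n_folds (training, holdout) pairs."""
    if not 2 <= n_folds <= n_rows:
        raise ValueError("n_folds must be between 2 and n_rows")
    base, extra = divmod(n_rows, n_folds)
    folds = []
    start = 0
    for k in range(n_folds):
        size = base + (1 if k < extra else 0)   # spread any remainder evenly
        holdout = list(range(start, start + size))
        held = set(holdout)
        training = [i for i in range(n_rows) if i not in held]
        folds.append((training, holdout))
        start += size
    return folds

# With as many folds as rows, every holdout set is a single row,
# which is leave-one-out cross validation.
loo = kfold_indices(16, 16)
four_fold = kfold_indices(16, 4)
```

With `n_folds` equal to `n_rows`, each model is fit on all rows but one and evaluated on the row left out. With fewer folds, each holdout set contains several rows.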

In JMP Pro, in addition to fitting main effects, you can also fit polynomial, interaction, and categorical effects by using the Partial Least Squares personality in Fit Model.

Example of Partial Least Squares<br />

We consider an example from spectrometric calibration, an area where partial least squares is very effective. Consider the Baltic.jmp data table. The data are reported in Umetrics (1995); the original source is Lindberg, Persson, and Wold (1983). Suppose that you are researching pollution in the Baltic Sea. You would like to use the spectra of samples of sea water to determine the amounts of three compounds that are present in these samples. The three compounds of interest are: lignin sulfonate (ls), which is pulp industry pollution; humic acid (ha), a natural forest product; and an optical whitener from detergent (dt). The amounts of these compounds in each of the samples are the responses. The predictors are spectral emission intensities measured at a range of wavelengths (v1–v27).

For the purposes of calibrating the model, samples with known compositions are used. The calibration data consist of 16 samples of known concentrations of lignin sulfonate, humic acid, and detergent, with emission intensities recorded at 27 equidistant wavelengths. Use the Partial Least Squares platform to build a model for predicting the amount of the compounds from the spectral emission intensities.

In the following, we describe the steps and show reports for JMP. Because different cross validation options are used in JMP Pro, JMP Pro results for the default Model Launch settings differ from those shown in the figures below. To duplicate the results below in JMP Pro, in the Model Launch window, enter 16 as the
