
LISA: Using JMP to design experiments and analyze the results

Liaosa Xu

Sept, 2012


Course Outline

Why We Need Custom Design
The General Approach
JMP Examples
Potential Collinearity Issues
Prior Design Evaluations
Augmented Design
Design from Candidate Set


Why Custom Design

Sometimes standard designs do not work, and computer-generated (custom/optimal) designs are the alternative. Typical situations:

An irregular experimental region
A mix of categorical and continuous variables
A nonstandard model
Unusual sample size requirements


Why Custom Design

An irregular experimental region (Montgomery 2009)

If the region of interest for the experiment is not a cube or a sphere, standard designs may not be possible.

An experimenter is investigating the properties of a particular adhesive. x1 is the amount of adhesive and x2 is the cure temperature. The prior knowledge is:

a) If too little adhesive is applied and the cure temperature is too low, the parts will not bond.
b) If both factors are at high levels, the parts will either be damaged by heat stress or bond inadequately.


Why Custom Design

Categorical Variables

A custom design can support a model that includes categorical variables with multiple levels.

Examples of categorical factors are machine, operator, solvent and catalyst.


Why Custom Design

A nonstandard model

Sometimes the experimenter has special knowledge or insight about the process being studied that suggests a nonstandard model (specific interaction terms and specific quadratic terms).

For example, the model proposed from prior knowledge is

[equation not preserved in this copy]

Note: this is not a full response surface model.


Why Custom Design

Unusual sample size requirements

Occasionally, we may need to reduce the number of runs required by standard designs.

For example, suppose we intend to fit a second-order model with four variables. The model has 15 terms to estimate. A central composite design (CCD) requires 26 to 30 runs. If the runs are expensive or time-consuming and we can afford fewer than 20, we can use a computer-generated design to reduce the number of runs.
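The run counts quoted above can be checked with a short calculation. This is a minimal sketch, assuming the usual full second-order model and a CCD built from a full 2^k factorial, 2k axial points, and 2 to 6 center runs; the function names are illustrative and not part of JMP.

```python
def second_order_terms(k: int) -> int:
    """Parameters in a full second-order model with k factors:
    1 intercept + k linear + k quadratic + k(k-1)/2 two-way interactions."""
    return 1 + 2 * k + k * (k - 1) // 2


def ccd_runs(k: int, center_runs: int) -> int:
    """Runs in a central composite design with a full 2**k factorial core."""
    return 2 ** k + 2 * k + center_runs


print(second_order_terms(4))            # 15
print(ccd_runs(4, 2), ccd_runs(4, 6))   # 26 30
```

With 15 parameters, any design needs at least 15 runs, so a budget just under 20 leaves only a few degrees of freedom for error.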


The General Approach of Custom Design

The usual approach for Custom Design is:

1) Specify a model
2) Determine the region of interest
   Linear constraints
3) Select the number of runs to make
4) Specify the optimality criterion
5) Create the design; consider adding some center-point runs.


Model Specification in Custom Design

The model specification:

All designs are model dependent.
By default, JMP includes all main effects as model terms.
Consider adding two-way interactions between each pair of factors.
For prediction purposes, consider using a response surface model with the I-optimal criterion.
Use your educated guess to specify the model terms.


Optimality Criterion in Custom Design

A custom design is also called an optimal design, since it is the best design with respect to some criterion.

A popular choice is the D-optimal design, which gives the most precise joint estimate of the effects.

D-optimal designs are most useful for determining the important factors in the model (most appropriate for screening experiments).

D-optimal designs are not preferred when the primary goal is prediction.
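To make the D criterion concrete: for a model matrix X, a D-optimal design maximizes the determinant of X'X. The sketch below is an illustration only, assuming a main-effects model in two coded factors and two made-up four-run designs; it is not how JMP searches for a design.

```python
import numpy as np


def model_matrix(points):
    """Intercept plus two main effects for each (x1, x2) run."""
    pts = np.asarray(points, dtype=float)
    return np.column_stack([np.ones(len(pts)), pts])


def d_criterion(points):
    """Determinant of X'X; larger means a smaller joint confidence region."""
    X = model_matrix(points)
    return float(np.linalg.det(X.T @ X))


factorial = [(-1, -1), (-1, 1), (1, -1), (1, 1)]   # corners of the region
clustered = [(-1, -1), (-1, 0), (0, -1), (0, 0)]   # runs crowded in one corner

print(d_criterion(factorial))
print(d_criterion(clustered))
```

The corner design spreads the runs over the whole region and scores far higher than the clustered one, which is the behavior the D criterion rewards.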


Optimality Criterion in Custom Design

Another choice in JMP is the I-optimal design, which seeks to minimize the average prediction variance over the design space.

When the prediction ability of the model is the major concern, the I-optimal design is preferred.

JMP selects the I-optimal design by default for response surface designs.
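The average prediction variance that the I criterion targets can be approximated numerically. The sketch below assumes a main-effects model in two coded factors on the square [-1, 1] x [-1, 1] and averages the relative prediction variance over a grid; the designs and function names are hypothetical, and JMP's own computation may differ.

```python
import numpy as np


def model_row(x1, x2):
    """Model expansion of one factor setting: intercept, x1, x2."""
    return np.array([1.0, x1, x2])


def relative_variance(design, point):
    """Prediction variance at `point`, divided by the error variance."""
    X = np.array([model_row(*run) for run in design])
    f = model_row(*point)
    return float(f @ np.linalg.inv(X.T @ X) @ f)


def average_relative_variance(design, grid_size=21):
    """Grid approximation of the average relative prediction variance."""
    ticks = np.linspace(-1.0, 1.0, grid_size)
    values = [relative_variance(design, (a, b)) for a in ticks for b in ticks]
    return float(np.mean(values))


factorial = [(-1, -1), (-1, 1), (1, -1), (1, 1)]
clustered = [(-1, -1), (-1, 0), (0, -1), (0, 0)]

print(average_relative_variance(factorial))
print(average_relative_variance(clustered))
```

A design with a lower average is better for prediction over the whole region, even if its coefficient estimates are not the most precise.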


JMP Example of Custom Design

A three-factor design (two numerical factors plus one categorical factor) was used to determine the operating conditions for modeling the amount of extraction.

x1: centrifuge inlet temperature, [40, 80]
x2: extraction temperature, [40, 60]
x3: solvent A, B and C

Constraint: centrifuge inlet temperature minus extraction temperature must be at least 0, i.e., x1 - x2 ≥ 0.

We use indicator variables z1 and z2 to denote the discrete levels of x3:

$$
z_1 = \begin{cases} 1, & \text{if A is assigned} \\ 0, & \text{otherwise} \end{cases}
\qquad
z_2 = \begin{cases} 1, & \text{if B is assigned} \\ 0, & \text{otherwise} \end{cases}
$$

The response surface model can be written as

$$
\begin{aligned}
y ={}& \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_{12} x_1 x_2 + \beta_{11} x_1^2 + \beta_{22} x_2^2 \\
&+ \gamma_1 z_1 + \gamma_2 z_2 + \delta_{11} z_1 x_1 + \delta_{21} z_2 x_1 + \delta_{12} z_1 x_2 + \delta_{22} z_2 x_2 + \varepsilon
\end{aligned}
$$
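The model above has 12 coefficients: six for the quadratic surface in x1 and x2, two for the solvent indicators, and four for the solvent-by-temperature interactions. A minimal sketch of one row of the corresponding model matrix follows; the function name and the column order are assumptions made for illustration.

```python
def rsm_row(x1, x2, solvent):
    """One model-matrix row for the extraction example.

    Solvent C is the reference level: z1 flags solvent A, z2 flags solvent B.
    """
    z1 = 1.0 if solvent == "A" else 0.0
    z2 = 1.0 if solvent == "B" else 0.0
    return [
        1.0, x1, x2, x1 * x2, x1 ** 2, x2 ** 2,   # quadratic surface in x1, x2
        z1, z2,                                    # solvent main effects
        z1 * x1, z2 * x1, z1 * x2, z2 * x2,        # solvent-by-temperature terms
    ]


row = rsm_row(60.0, 50.0, "A")
print(len(row))   # 12
```

Because the design must estimate all 12 columns, it needs at least 12 runs, and more if any degrees of freedom are wanted for error.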


JMP Example of Custom Design

[Figure: design space for x1 (horizontal axis, 40 to 80) and x2 (vertical axis, 40 to 60). The feasible region is the part of the rectangle where x1 - x2 ≥ 0.]
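The feasible region can also be listed explicitly. The sketch below lays a 5-degree grid over the factor ranges and keeps the points that satisfy x1 - x2 ≥ 0; the grid spacing is an arbitrary choice for illustration, and JMP itself treats the factors as continuous.

```python
import numpy as np

x1_levels = np.linspace(40, 80, 9)   # centrifuge inlet temperature: 40, 45, ..., 80
x2_levels = np.linspace(40, 60, 5)   # extraction temperature: 40, 45, ..., 60

grid = [(a, b) for a in x1_levels for b in x2_levels]
feasible = [(a, b) for a, b in grid if a - b >= 0]   # linear constraint x1 - x2 >= 0

print(len(grid), len(feasible))   # 45 35
```

The ten excluded points form the upper-left triangle of the rectangle, where the extraction temperature would exceed the inlet temperature.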


JMP Example of Custom Design

JMP → DOE → Custom Design

[Screenshot: Custom Design window with the Factors, Constraints and Model Specification panels labeled.]


JMP Example of Custom Design

Settings before generating the design:

Set Random Seed to 1000 (for illustration only, not in practice!).
Set Number of Starts to 1000.
Simulate Responses (used for collinearity detection later).
Optimality Criterion: let's change to D-optimality.
Set the Number of Runs.
Click Make Design to generate the design.


JMP Example of Custom Design

Design Output

After you create the design, get some rough evaluations of it before you run it!

Evaluation information is shown with the design (more details later).

Randomize the run order when you make the table!


JMP Example of Custom Design

[Screenshot: design table with a simulated response column.]

Your own response data go in this column once the experiment has been run.


Collinearity Problems

Because custom designs are generally not orthogonal, multicollinearity is possible.

Multicollinearity occurs when two or more predictors in the model are correlated and provide redundant information about the response.

It is a potential problem for three reasons:

a) Large variances and covariances of the estimated regression coefficients.
b) Unstable regression coefficients, sometimes with the wrong sign. A small perturbation of the response can lead to large changes in the effect estimates, or even to opposite signs.
c) Results that are often confusing and misleading.


Collinearity Problems

Detecting multicollinearity

Calculate the variance inflation factor (VIF) for each predictor x_j:

$$
\mathrm{VIF}_j = \frac{1}{1 - R_j^2}
$$

where R_j^2 is the coefficient of determination from regressing x_j on all the other predictors.
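A minimal sketch of the VIF calculation, using the standard definition VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from regressing predictor j on the remaining predictors plus an intercept. The function name and the two small data sets are made up for illustration; JMP's reported VIFs can differ, for instance when the model is fit without an intercept.

```python
import numpy as np


def vif(predictors):
    """Variance inflation factor for each column of an n x p predictor matrix."""
    X = np.asarray(predictors, dtype=float)
    n, p = X.shape
    factors = []
    for j in range(p):
        target = X[:, j]
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        coef, *_ = np.linalg.lstsq(others, target, rcond=None)
        residual = target - others @ coef
        r_squared = 1.0 - (residual @ residual) / np.sum((target - target.mean()) ** 2)
        factors.append(1.0 / (1.0 - r_squared))
    return np.array(factors)


orthogonal = [(-1, -1), (-1, 1), (1, -1), (1, 1)]
correlated = [(-1.0, -1.0), (-0.5, -0.4), (0.0, 0.1), (0.5, 0.4), (1.0, 1.0)]

print(vif(orthogonal))   # both factors close to 1
print(vif(correlated))   # both factors well above 10
```

An orthogonal design gives VIFs of 1, while nearly proportional columns push the VIFs far above the usual warning level of 10.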


Collinearity Problems

Detecting multicollinearity using JMP

1) Click the red triangle (▼) next to “Model” in the upper-left panel of the data table → Run Script.
2) Click the Run Model button.
3) Scroll to the Parameter Estimates section.
4) Right-click on the table → Columns → VIF.


Collinearity Problems

Detecting multicollinearity using JMP

No VIF is larger than 10, so there is no severe multicollinearity in this design.


Design Evaluations:

The prediction variance at any factor setting, or over the whole design space, is the product of the error variance and a quantity that depends only on the design and the factor setting.

That quantity, the ratio of the prediction variance to the error variance, is called the relative variance of prediction. It can be calculated before acquiring the data.

Ideally, the prediction variance is small throughout the allowable regions of the factors.
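In symbols, using the standard least-squares result (the notation is assumed here, since the slides give no formula): let X be the model matrix of the design, f(x) the model expansion of a factor setting x, and σ² the error variance.

```latex
\operatorname{Var}\bigl(\hat{y}(x)\bigr) = \sigma^{2}\, f(x)^{\top} (X^{\top}X)^{-1} f(x),
\qquad
\frac{\operatorname{Var}\bigl(\hat{y}(x)\bigr)}{\sigma^{2}} = f(x)^{\top} (X^{\top}X)^{-1} f(x).
```

The right-hand side of the second expression involves only the design and the model, which is why it can be evaluated before any response is measured.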


Design Evaluations: Prediction Variance Profile

The prediction variance shown (0.5) is relative to the error variance. For example, if the estimated (prior) variance of experimental error (MSE) is 10, then the prediction variance of y at the center value of x1 (= 0) is 10 × 0.5 = 5.

Control-click on a factor to set a factor level of your choice.

You can drag the vertical trace lines to change the factor settings to different points.


Design Evaluations: Prediction Variance Profile

The Maximize Desirability command on the Prediction Variance Profile title bar identifies the maximum (worst-case) prediction variance for the model, here 1.321.

Placing the prediction variance profilers for two designs side by side is one way to compare the designs.


Design Evaluations: Fraction of Design Space

The Fraction of Design Space (FDS) plot shows what fraction of the design space has a prediction variance above (or below) a given value.

The X axis is the proportion or percentage of the prediction space, ranging from 0 to 100%, and the Y axis is the range of prediction variance values.

Note: the 90th-percentile prediction variance is a useful summary in a variety of scenarios.

Using the crosshair tool shows that 90% of the possible factor settings have a relative prediction variance less than 0.91.
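An FDS curve is simply the sorted relative prediction variances over the design space, plotted against the cumulative fraction of the space. The sketch below is an illustration under stated assumptions: a main-effects model in two coded factors, a 2 x 2 factorial design, and a regular grid standing in for the design space. It does not reproduce the 0.91 value from the slide, which belongs to the extraction example.

```python
import numpy as np


def model_row(x1, x2):
    """Model expansion of one factor setting: intercept, x1, x2."""
    return np.array([1.0, x1, x2])


def fds_curve(design, grid_size=41):
    """Return (fractions, sorted relative prediction variances) over a grid."""
    X = np.array([model_row(*run) for run in design])
    xtx_inv = np.linalg.inv(X.T @ X)
    ticks = np.linspace(-1.0, 1.0, grid_size)
    variances = np.array(
        [model_row(a, b) @ xtx_inv @ model_row(a, b) for a in ticks for b in ticks]
    )
    variances.sort()
    fractions = np.arange(1, len(variances) + 1) / len(variances)
    return fractions, variances


design = [(-1, -1), (-1, 1), (1, -1), (1, 1)]
fractions, variances = fds_curve(design)
q90 = float(np.quantile(variances, 0.90))
print(q90)   # 90% of the grid points have a relative variance at or below this
```

Plotting `variances` against `fractions` gives the FDS plot; a flatter, lower curve indicates a better design for prediction.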


Design Evaluations: Design Diagnostics

These efficiency measures are single numbers, each attempting to quantify one design characteristic. While the maximum efficiency is 100% for any criterion, an efficiency of 100% is impossible for many design problems.

D: minimize the joint confidence region for the regression coefficients.
G: minimize the maximum scaled prediction variance.
A: minimize the sum of the variances of all regression coefficients.

It is best to use these measures to compare two competing designs with the same model and number of runs, rather than as absolute measures of design quality.
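The three efficiencies can be sketched with common textbook definitions for a design with n runs and p model terms: D = 100 |X'X|^(1/p) / n, A = 100 p / (n trace((X'X)^-1)), and G = 100 p / (n max relative prediction variance). These formulas assume coded factors, and their scaling may differ from what JMP reports; the designs and function names are hypothetical.

```python
import numpy as np


def model_row(x1, x2):
    """Model expansion of one factor setting: intercept, x1, x2."""
    return np.array([1.0, x1, x2])


def efficiencies(design, grid_size=21):
    """Return (D, A, G) efficiencies in percent for a main-effects model."""
    X = np.array([model_row(*run) for run in design])
    n, p = X.shape
    xtx = X.T @ X
    xtx_inv = np.linalg.inv(xtx)
    d_eff = 100.0 * np.linalg.det(xtx) ** (1.0 / p) / n
    a_eff = 100.0 * p / (n * np.trace(xtx_inv))
    ticks = np.linspace(-1.0, 1.0, grid_size)
    max_var = max(
        model_row(a, b) @ xtx_inv @ model_row(a, b) for a in ticks for b in ticks
    )
    g_eff = 100.0 * p / (n * max_var)
    return float(d_eff), float(a_eff), float(g_eff)


factorial = [(-1, -1), (-1, 1), (1, -1), (1, 1)]
clustered = [(-1, -1), (-1, 0), (0, -1), (0, 0)]

print(efficiencies(factorial))
print(efficiencies(clustered))
```

The orthogonal factorial reaches 100% on all three measures for this simple model, which is the exception rather than the rule; constrained or irregular problems usually cannot.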


Why Augmented Design?

Experimentation is an iterative process; we cannot assume that one successful screening experiment has optimized our process.

Four common reasons for an unsatisfactory experiment:

The specified model is inadequate.
The results predicted from the experiment are not reproducible.
Many trials failed.
Important conditions, often an optimum, lie outside the experimental region.


Motivation for Augmented Design: Model Inadequacy

The inadequacies of the model may be revealed during the analysis of the data.

The investigated relationship may be more complicated than expected, calling for:

Inclusion of higher-order polynomial terms
Transformations of factors or response
Detection of ‘lurking’ variables
Correction of factor ranges that turned out to be wrong


Motivation for Augmented Design: Failing Trials

If many individual trials fail, there may not be sufficient data to estimate the parameters of the model.

It’s important to find out whether there was some technical mishap or whether something more fundamental is amiss.


Motivation for Augmented Design: Optimum Issues

If the fitted model places the optimum of the response, or of some performance characteristic, appreciably outside the present experimental region, experimental confirmation of this prediction will be necessary.


Motivation for Augmented Design

We now consider the augmentation of a design by the addition of a specified number of new trials.

Reasons for augmentation include the need for a higher-order model, a different design region, the introduction of a new factor, or the removal of unimportant factors.

The new design depends on the trials for which the response is known, although not usually on the values of the responses.


Examples of Augmented Design

Example: A chemical engineer investigates the effects of six factors on the percent reaction yield of a chemical process. A 2^(6-2) fractional factorial design (screening design) is run with 2 additional center runs.

The data show significant curvature for some factors, but collinearity prevents us from identifying which factor or factors are responsible.

The engineers would also like to know whether increasing the catalyst concentration above the current 0.2 M would give a higher yield.

With three selected factors plus catalyst concentration, an augmented design with 8 additional runs is used to fit the response surface model for the four factors.


Augmented Design in JMP

Open Augmented design 18 Runs.JMP in JMP.

Click the red triangle next to Screening → Run Script.


Augmented Design in JMP

The screening analysis indicates a curvature effect for X3, but it is completely aliased with all the other quadratic effects, including X1² and X4².
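The aliasing of the quadratic effects follows from the geometry of a two-level design with center points: every squared column equals 1 at the factorial points and 0 at the center, so all squared columns are identical. The sketch below shows this with a full 2^3 factorial plus two center runs, a simplified stand-in (an assumption for brevity) for the 2^(6-2) design with center runs used in the example.

```python
import itertools

import numpy as np

corners = np.array(list(itertools.product([-1.0, 1.0], repeat=3)))  # 8 factorial runs
centers = np.zeros((2, 3))                                          # 2 center runs
runs = np.vstack([corners, centers])

squares = runs ** 2                       # columns x1^2, x2^2, x3^2
X = np.column_stack([np.ones(len(runs)), squares])

print(np.array_equal(squares[:, 0], squares[:, 1]))   # True: identical columns
print(np.linalg.matrix_rank(X))                       # 2, not 4
```

The center runs reveal that some curvature exists, but runs at additional levels (such as axial points) are needed to attribute it to a particular factor.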


Augmented Design in JMP

Click the red triangle next to “Full Factorial Model with X1, X3 and X4” → Run Script.

The lack-of-fit test indicates that the full factorial model is not adequate.


Augmented Design in JMP

Augmented design with a different model specification

DOE → Augment Design

Click OK.


Augmented Design in JMP

Augmentation Choices → Augment

You can change the upper and lower limits for the new runs, which gives you a different prediction region. Note that the upper bound for catalyst is now 0.8.

The additional runs are usually performed on a different day. We may treat the day as a blocking effect in the model by checking the box that groups the new runs into a separate block.


Augmented Design in JMP

Specify RSM

Again, the design is model dependent, so specifying the correct model is crucial for a satisfactory experiment. In an augmented design, you usually change some model terms, such as interactions and quadratic effects.


Augmented Design in JMP

Choose I-optimal as the optimality criterion.
Set Number of Starts to 1000.
Set Random Seed to 1000.
Specify the total number of runs, including the original runs (18 + 8 = 26).
Click Make Design.
Click Make Table.


Augmented Design in JMP

The new runs are grouped in block 2.


Augmented Design in JMP

[Figure legend: ○ additional runs, ∆ original design points]

For X1, X3 and X4, the original runs lie on corner and center points; the added runs lie on axial and face points.


Augmented Design in JMP

Open Augmented design 26 Runs.JMP in JMP.

Click the red triangle next to Model → Run Script.

The lack-of-fit test is not significant for the RS model.

From the effect tests, the X1² and X3² quadratic effects are significant, and the catalyst effect is significant at alpha = 0.1.


Augmented Design in JMP

Click the red triangle next to Reduced Model → Run Script.

The lack-of-fit test is not significant for the RS model.

All model effects are significant at alpha = 0.05.


Augmented Design in JMP

Click the red triangle next to Prediction Profiler → Maximize Desirability.

The predicted yield is maximized at X1 = 1, X3 = 1, X4 = -1 and Catalyst = 0.8, with a predicted value of 27.25. Confirm this prediction with an additional run.


Design from a Candidate Set

Motivations for Designs Based on a Candidate Set
What to Consider When Using Candidate Set Design
Candidate Set Design in JMP


What is a Candidate Set?

A candidate set of design points is the full group of possible data points from which the actual design points can be chosen.

For example, to construct quantitative structure–activity relationship (QSAR) models, which summarize a supposed relationship between the chemical structures and the biological activity of chemicals, a chemist may search a chemical compound database to get some candidate compounds.

A QSAR has the form:

[equation not preserved in this copy]


Why Design Based on a Candidate Set

We may not have full control of the experimental factors and may be limited to a choice among some factor combinations.

The design space may be complex, with irregular factor settings and complicated non-linear factor constraints.

As the term suggests, a candidate set design helps us pick the best design points from the candidate set with respect to some criterion.


Design from a Candidate Set

Example, Mitchell (1974a): An animal scientist wants to compare wildlife densities in four different habitats over a year. However, due to the cost of experimentation, only 12 observations can be made. The following model is postulated for the density in each habitat during each month:

[equation not preserved in this copy]

This model includes the habitat as a classification variable, the effect of time with an overall linear drift term, and cyclic behavior in the form of a Fourier series. There is no intercept term in the model.

Note: with 12 months and 4 habitats, we can create a candidate set of 12 × 4 = 48 points.


Design from a Candidate Set

Open Mitchell.csv in JMP.

The data set contains the 48 candidate points and includes the four cosine variables (c1, c2, c3 and c4) and three sine variables (s1, s2 and s3).


Design from a Candidate Set

JMP will not randomize a candidate set design; do it manually!

Because the design space is limited, we may end up with a design that has severe collinearity.

Look at the variance inflation factors (VIF) to assess the lack of orthogonality.


Design from a Candidate Set

Be sure to assess the quality of the design (e.g., FDS plots, statistical power, relative variance of coefficients).

Check your final design space to diagnose possible problems.

Remember that each design point in the candidate set can be selected only once.


Candidate Set Design in JMP

Change the data type of some factors in the JMP file:

Right-click on the Habitat column → Column Information → Data Type → Character → OK

You need to specify the correct data type in this step, since it cannot be changed in Custom Design.


Candidate Set Design in JMP

DOE → Custom Design → Add Factor → Covariate

In JMP 9 you have to select one factor at a time.

JMP treats Habitat with the categorical data type it has in the candidate set file.


Candidate Set Design in JMP

[Screenshot: Model Specification panel.]


Candidate Set Design in JMP

Custom Design ▼ → Optimality Criterion → Make D-Optimal Design

Set the seed to 193030034.
Set the number of starts to 100.
Specify 12 in Number of Runs.
Click Make Design.
Click Make Table.
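What the software does at this step can be sketched as a search over the candidate set for the subset of runs that maximizes |X'X|. The code below is a simple point-exchange heuristic written for illustration; it is not JMP's algorithm. It also uses a simplified model (habitat indicators with no intercept, plus a scaled linear month term) and leaves out the Fourier terms, whose exact form is not given in this copy. All names are hypothetical.

```python
import numpy as np


def candidate_set():
    """All 4 habitats x 12 months = 48 candidate points."""
    return [(h, m) for h in range(4) for m in range(1, 13)]


def model_row(habitat, month):
    """Habitat indicators (no intercept) plus a scaled linear month term."""
    row = [0.0] * 4
    row[habitat] = 1.0
    row.append((month - 6.5) / 5.5)
    return row


def log_det(rows):
    """log|X'X|, with a tiny ridge so singular designs get a finite score."""
    X = np.array(rows)
    _, value = np.linalg.slogdet(X.T @ X + 1e-8 * np.eye(X.shape[1]))
    return float(value)


def exchange_design(n_runs=12, seed=0):
    """Swap chosen points for unused candidates until no swap improves log|X'X|."""
    cands = candidate_set()
    rows = [model_row(*c) for c in cands]
    rng = np.random.default_rng(seed)
    chosen = [int(i) for i in rng.choice(len(cands), size=n_runs, replace=False)]
    best = log_det([rows[i] for i in chosen])
    improved = True
    while improved:
        improved = False
        for pos in range(n_runs):
            for cand in range(len(cands)):
                if cand in chosen:
                    continue  # each candidate point may be selected only once
                trial = chosen.copy()
                trial[pos] = cand
                value = log_det([rows[i] for i in trial])
                if value > best + 1e-9:
                    chosen, best, improved = trial, value, True
    return [cands[i] for i in sorted(chosen)], best


design, score = exchange_design()
print(design)
```

A single start can stop at a local optimum, which is why the Number of Starts setting matters: repeating the search from many random starting designs and keeping the best result makes a poor design less likely.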


Candidate Set Design in JMP

Candidate set design and evaluations (non-randomized)

Note that for this design we cannot actually randomize, because of the nature of the Month factor. Let’s assume it can be randomized, for illustration purposes.


Candidate Set Design in JMP

To randomize and calculate VIFs manually in JMP, you need response data:

1) Right-click on the Y column → Formula.
2) Scroll in the “Functions” box, choose Random → Random Uniform, then click the OK button.
3) Right-click on the Y column → Sort.
4) Click the red triangle (▼) next to “Model” in the upper-left panel of the data table → Run Script.
5) Check No Intercept.
6) Click the Run Model button.
7) Scroll to the Parameter Estimates section.
8) Right-click on the table → Columns → VIF.
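The randomization trick in the first three steps (fill a column with random uniform values, then sort on it) is equivalent to drawing a random permutation of the runs. A minimal sketch outside JMP, with a hypothetical function name and an arbitrary seed:

```python
import numpy as np


def randomize_run_order(n_runs, seed=None):
    """Give each run a random uniform key and return the run indices sorted by key."""
    rng = np.random.default_rng(seed)
    keys = rng.uniform(size=n_runs)           # plays the role of the Random Uniform column
    return [int(i) for i in np.argsort(keys)]


order = randomize_run_order(12, seed=2012)
print(order)   # a permutation of 0..11; run the experiments in this order
```

Fixing the seed makes the order reproducible for a write-up; in a real experiment a fresh seed is the safer choice.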


Candidate Set Design in JMP

Run the experiments according to this randomized order.

[Screenshot: randomized design table and the VIF column of the Parameter Estimates report.]


More capabilities of JMP

1. Split-Plot/Split-Split-Plot Design
2. Saturated and Supersaturated Design
3. Mixture Design
4. Choice Design
5. Non-linear Design
6. Space Filling Design / Design for Computer Experiments


Workshop

Scenario 1: Suppose an experimenter is investigating the properties of a particular adhesive. x1 is the amount of adhesive and x2 is the cure temperature. The prior knowledge is:

a) If too little adhesive is applied and the cure temperature is too low, the parts will not bond. [constraint not preserved in this copy]
b) If both factors are at high levels, the parts will either be damaged by heat stress or bond inadequately. [constraint not preserved in this copy]

The model of interest is a non-standard model, i.e.,

[equation not preserved in this copy]

Also, due to the budget constraint, only 30 runs can be afforded.


Workshop

Scenario 2: Meyer et al. (1996) demonstrate how to use the augment designer in JMP to resolve ambiguities left by a screening design. In this study, a chemical engineer investigates the effects of five factors on the percent reaction of a chemical process.

To begin, open Reactor 8 Runs.jmp and augment this design with 8 additional runs to incorporate All Two-Factor Interactions.

Set the seed to 12834729.

After you create the design, you can open Reactor Augment Data.jmp to do the variable selection using stepwise regression.

Note: Choose P-value Threshold from the Stopping Rule menu and Mixed from the Direction menu, and make sure Prob to Enter is 0.050 and Prob to Leave is 0.100. These are not the default values.


Workshop

Scenario 3: An automotive engineer wants to fit a quadratic (response surface) model to fuel consumption data in order to find the values of the control variables that minimize fuel consumption (refer to Vance 1986). The three control variables, AFR (air fuel ratio), EGR (exhaust gas recirculation) and SA (spark advance), and their possible settings are shown in the following table:

Variable   Values
AFR        15, 16, 17, 18
EGR        0.020, 0.177, 0.377, 0.566, 0.921, 1.117
SA         10, 16, 22, 28, 34, 40, 46, 52

Rather than run all 192 (4×6×8) combinations of these factors (saved as candidate set workshop 3.csv), the engineer would like to see whether the total number of runs can be reduced to 50 in an optimal fashion.
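The 192-point candidate set is the full cross of the three factors' settings. A minimal sketch that rebuilds it (the variable names are illustrative; the workshop itself supplies the set as a CSV file):

```python
import itertools

AFR = [15, 16, 17, 18]
EGR = [0.020, 0.177, 0.377, 0.566, 0.921, 1.117]
SA = [10, 16, 22, 28, 34, 40, 46, 52]

# Every (AFR, EGR, SA) combination is one candidate run.
candidates = list(itertools.product(AFR, EGR, SA))

print(len(candidates))   # 192
```

The custom designer then picks 50 of these rows, so about a quarter of the full factorial is actually run.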


Design Evaluations:

The signal-to-noise ratio (often abbreviated SNR or S/N) is a measure used in science and engineering to quantify how much a signal has been corrupted by noise.

It is defined as the ratio of the signal power to the noise power corrupting the signal. A ratio higher than 1:1 indicates more signal than noise.

In less technical terms, the signal-to-noise ratio compares the level of a desired signal (such as music) to the level of background noise. The higher the ratio, the less obtrusive the background noise is.

In statistical analysis, the signal in SNR is the regression coefficient of a model term, and the noise is the experimental error (model error) expressed as a standard deviation.


Design Evaluations:

The Power column shows the power of the design, as specified, to detect effects of a certain size (SNR) at a given significance level.

Here, assume the model error standard deviation is σ = 2.5 and the true coefficient value β2 for X2 is 2.5. Then SNR = 2.5/2.5 = 1.0, and the probability (power) of detecting such an effect is 0.402 at significance level 0.05.

In JMP, you can change the SNR setting (e.g., Signal to Noise = 1, i.e., β2/σ = 1) and the significance level (0.05).


Design Evaluations: Power of the Design

For example, if the coefficient of Extraction Temp is large relative to the random noise, say β2 is twice as large as the noise (β2 = 5.0, so SNR = β2/σ = 2), we have more power to detect it (0.909 versus 0.402).


Design Evaluations:

As the significance level increases, the power increases.

As the signal-to-noise ratio increases, the power increases.

Note: if your design turns out to have very low power even at large signal-to-noise settings, you should question whether it is worth running the experiment!
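Both relationships can be checked with a textbook power calculation for a single coefficient, based on the noncentral F distribution. This is a sketch under stated assumptions: it requires SciPy, the relative variance (0.25) and error degrees of freedom (6) are made-up inputs rather than values from the extraction design, and it is not claimed to be the exact computation JMP performs, so it will not reproduce the 0.402 and 0.909 figures.

```python
from scipy import stats


def coefficient_power(snr, rel_var, error_df, alpha=0.05):
    """Power of the F-test (1 numerator df) for one regression coefficient.

    snr:      true coefficient divided by the error standard deviation
    rel_var:  the coefficient's diagonal entry of (X'X)^-1
    error_df: residual degrees of freedom of the fitted model
    """
    noncentrality = snr ** 2 / rel_var
    critical = stats.f.ppf(1.0 - alpha, 1, error_df)
    return float(stats.ncf.sf(critical, 1, error_df, noncentrality))


low_snr = coefficient_power(snr=1.0, rel_var=0.25, error_df=6)
high_snr = coefficient_power(snr=2.0, rel_var=0.25, error_df=6)
looser_alpha = coefficient_power(snr=1.0, rel_var=0.25, error_df=6, alpha=0.10)

print(low_snr, high_snr, looser_alpha)
```

Doubling the signal-to-noise ratio or loosening the significance level both raise the power, while a larger relative variance for the coefficient (a weaker design) lowers it.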
