LISA: Using JMP to design experiments and analyze the results
Liaosa Xu<br />
Sept, 2012<br />
Course Outline<br />
Why We Need Custom Design<br />
The General Approach<br />
JMP Examples<br />
Potential Collinearity Issues<br />
Prior Design Evaluations<br />
Augmented Design<br />
Design from Candidate Set<br />
Why Custom Design<br />
Sometimes standard designs may not work;<br />
computer-generated (custom/optimal) designs<br />
are an alternative. Typical situations:<br />
An irregular experimental region<br />
A mix of categorical and continuous variables<br />
A nonstandard model<br />
Unusual sample size requirements<br />
Why Custom Design<br />
An irregular experimental region (Montgomery 2009)<br />
<br />
If the region of interest for the experiment is not a cube or a sphere,<br />
standard designs may not be possible.<br />
An experimenter is investigating the properties<br />
of a particular adhesive. x_1 is the amount of<br />
adhesive and x_2 is the cure temperature. The prior<br />
knowledge is:<br />
a) If too little adhesive is applied and the cure<br />
temperature is too low, the parts will not bond.<br />
b) If both factors are at high levels, the parts will<br />
either be damaged by heat stress or<br />
bond inadequately.<br />
Why Custom Design<br />
Categorical Variables<br />
Custom design can accommodate a model that<br />
includes categorical variables with multiple<br />
levels.<br />
Examples of categorical factors are machine,<br />
operator, solvent and catalyst.<br />
Why Custom Design<br />
A nonstandard model<br />
Sometimes the experimenter has special<br />
knowledge or insight about the process being studied<br />
that suggests a nonstandard model (specific<br />
interaction terms and specific quadratic terms).<br />
For example, the model proposed from prior knowledge<br />
is<br />
<br />
<br />
Note: this is not a full response surface model.<br />
Why Custom Design<br />
Unusual sample size requirements<br />
Occasionally, we may need to reduce the number of runs required by<br />
standard designs.<br />
For example, suppose we intend to fit a second-order model with<br />
four variables. The model has 15 terms to estimate. A central<br />
composite design (CCD) requires 26-30 runs. If the<br />
runs are expensive or time-consuming and we can<br />
afford fewer than 20 runs, we can use a computer-generated<br />
design to reduce the number of runs.<br />
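The counts quoted for the four-variable example can be checked with a little arithmetic. The sketch below is plain Python, not JMP; the helper names are hypothetical, and the 26-30 range is reproduced under the assumption that the CCD has a full 2^4 factorial core, 8 axial points and between 2 and 6 center points.

```python
from math import comb


def second_order_terms(k: int) -> int:
    """Number of terms in a full second-order model with k factors."""
    # intercept + k main effects + C(k, 2) two-factor interactions + k quadratics
    return 1 + k + comb(k, 2) + k


def ccd_runs(k: int, center_points: int) -> int:
    """Runs in a CCD with a full 2^k factorial core (an assumption of this sketch)."""
    return 2 ** k + 2 * k + center_points


terms = second_order_terms(4)
ccd_range = (ccd_runs(4, 2), ccd_runs(4, 6))
print(terms, ccd_range)
```

A computer-generated design needs at least as many runs as there are terms, so any run count from 15 up to the budget of fewer than 20 is a candidate here.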
The General Approach of Custom Design<br />
The usual approach for custom design is:<br />
1) Specify a model<br />
2) Determine the region of interest<br />
(including any linear constraints)<br />
3) Select the number of runs to make<br />
4) Specify the optimality criterion<br />
5) Create the design; consider adding some center-point runs.<br />
Model Specification in Custom Design<br />
Model specification:<br />
All designs are model dependent.<br />
By default, JMP enters all main effects as the model terms.<br />
Consider adding the two-way interactions between each pair of<br />
factors.<br />
For prediction purposes, consider using a response surface<br />
model with the I-optimal criterion.<br />
Use your educated guess to specify the model terms.<br />
Optimality Criterion in Custom Design<br />
A custom design is also called an optimal design because it is the best<br />
with respect to some criterion.<br />
A popular choice is the D-optimal design, which gives the most<br />
precise joint estimate of the effects.<br />
D-optimal designs are most useful for determining the<br />
important factors in the model (most appropriate for<br />
screening experiments).<br />
D-optimal designs are not preferred when the primary goal is<br />
prediction.<br />
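The D-criterion is conventionally stated as maximizing the determinant of the information matrix X'X, where X is the model matrix. The following is a minimal pure-Python sketch of that comparison for two hypothetical four-run designs and a one-factor first-order model; it only evaluates two given designs and does not reproduce JMP's search algorithm. All function names are illustrative.

```python
def information_matrix(X):
    """Return X'X for a model matrix X given as a list of rows."""
    p = len(X[0])
    return [[sum(row[i] * row[j] for row in X) for j in range(p)] for i in range(p)]


def determinant(M):
    """Determinant by cofactor expansion (adequate for small matrices)."""
    if len(M) == 1:
        return M[0][0]
    total = 0
    for j, a in enumerate(M[0]):
        minor = [row[:j] + row[j + 1:] for row in M[1:]]
        total += (-1) ** j * a * determinant(minor)
    return total


def model_matrix(levels):
    """Rows (1, x) for the first-order model y = b0 + b1*x."""
    return [[1, x] for x in levels]


# Hypothetical design A: all runs at the ends of the range [-1, 1]
d_ends = determinant(information_matrix(model_matrix([-1, -1, 1, 1])))
# Hypothetical design B: two runs moved to the center
d_spread = determinant(information_matrix(model_matrix([-1, 0, 0, 1])))
print(d_ends, d_spread)
```

The larger determinant for the end-point design means a smaller joint confidence region for (b0, b1), which is why it is preferred under the D-criterion for this model.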
Optimality Criterion in Custom Design<br />
Another choice in JMP is the I-optimal design, which<br />
seeks to minimize the average prediction variance<br />
over the design space.<br />
When the predictive ability of the model is the<br />
major concern, the I-optimal design is preferred.<br />
JMP selects the I-optimal design by default for<br />
response surface designs.<br />
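To make "average prediction variance over the design space" concrete, here is a small sketch for a one-factor first-order model on [-1, 1]. It assumes the relative prediction variance f(x)'(X'X)^-1 f(x) with f(x) = (1, x) and approximates the average with a midpoint rule; the two designs and all names are hypothetical, and this is not JMP's implementation.

```python
def relative_prediction_variance(levels, x):
    """f(x)'(X'X)^-1 f(x) for y = b0 + b1*x, using the closed-form 2x2 inverse."""
    n = len(levels)
    sx = sum(levels)
    sxx = sum(v * v for v in levels)
    det = n * sxx - sx * sx  # determinant of X'X
    return (sxx - 2 * x * sx + n * x * x) / det


def average_prediction_variance(levels, n_grid=1000):
    """Midpoint-rule average of the relative variance over [-1, 1]."""
    xs = [-1 + (2 * i + 1) / n_grid for i in range(n_grid)]
    return sum(relative_prediction_variance(levels, x) for x in xs) / n_grid


avg_ends = average_prediction_variance([-1, -1, 1, 1])
avg_spread = average_prediction_variance([-1, 0, 0, 1])
print(round(avg_ends, 4), round(avg_spread, 4))
```

For this simple model the end-point design has the smaller average variance as well; for models with quadratic terms the D- and I-criteria can favor different designs.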
JMP Example of Custom Design<br />
A three-factor design (two numerical factors plus one categorical factor) was used to<br />
determine the operating conditions for modeling the amount of<br />
extraction.<br />
x_1: centrifuge inlet temperature, [40, 80]<br />
x_2: extraction temperature, [40, 60]<br />
x_3: solvent A, B and C<br />
Constraint: centrifuge inlet temperature minus extraction temperature must be ≥ 0,<br />
i.e., x_1 - x_2 ≥ 0.<br />
We use indicator variables z_1 and z_2 to denote the discrete levels of x_3:<br />
\[ z_1 = \begin{cases} 1, & \text{if A is assigned} \\ 0, & \text{otherwise} \end{cases} \qquad z_2 = \begin{cases} 1, & \text{if B is assigned} \\ 0, & \text{otherwise} \end{cases} \]<br />
The response surface model can be written as<br />
\[ y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_{12} x_1 x_2 + \beta_{11} x_1^2 + \beta_{22} x_2^2 + \gamma_1 z_1 + \gamma_2 z_2 + \delta_{11} z_1 x_1 + \delta_{21} z_2 x_1 + \delta_{12} z_1 x_2 + \delta_{22} z_2 x_2 + \varepsilon \]
JMP Example of Custom Design<br />
[Figure: design space for x_1 (horizontal axis, 40 to 80) and x_2 (vertical axis, 40 to 60). The feasible region is the part of the rectangle where x_1 - x_2 ≥ 0.]
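The effect of the linear constraint on the design region can be illustrated by enumerating a coarse grid and keeping only the feasible points. This is a sketch under stated assumptions: the 10-degree grid step is chosen only for illustration, the names are hypothetical, and JMP does not require such a grid to build the design.

```python
from itertools import product

X1_LEVELS = range(40, 81, 10)   # centrifuge inlet temperature: 40, 50, ..., 80
X2_LEVELS = range(40, 61, 10)   # extraction temperature: 40, 50, 60
SOLVENTS = ("A", "B", "C")


def is_feasible(x1, x2):
    """Linear constraint from the example: x1 - x2 >= 0."""
    return x1 - x2 >= 0


feasible_points = [
    (x1, x2, solvent)
    for x1, x2, solvent in product(X1_LEVELS, X2_LEVELS, SOLVENTS)
    if is_feasible(x1, x2)
]
print(len(feasible_points))
```

Of the 15 (x1, x2) grid pairs, the three with x1 < x2 are dropped, which is the corner cut off the rectangle by the constraint.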
JMP Example of Custom Design<br />
JMP → DOE → Custom Design<br />
Factors<br />
Constraints<br />
Model Specification
JMP Example of Custom Design<br />
Set Random Seed to<br />
1000 (for illustration only,<br />
not in practice!)<br />
Simulate Responses<br />
(used later for<br />
collinearity<br />
detection)<br />
Optimality<br />
Criterion<br />
Set Number of Starts to<br />
1000<br />
Let's change to<br />
D-optimality<br />
Number of<br />
Runs<br />
Click here to generate the design
JMP Example of Custom Design<br />
Design Output<br />
After you create the design, get<br />
a rough evaluation of your<br />
design before you run it!<br />
Evaluation Information<br />
(more details later)<br />
Randomize the run order of your design<br />
when you make the table!
JMP Example of Custom Design<br />
Simulated response<br />
You can enter your data here!
Collinearity Problems<br />
Because the custom designs considered here are not orthogonal,<br />
multicollinearity is possible.<br />
Multicollinearity occurs when two or more predictors in the<br />
model are correlated and provide redundant information about<br />
the response.<br />
It is a potential problem for three reasons:<br />
a) Large variances and covariances of the estimated<br />
regression coefficients.<br />
b) Unstable regression coefficients, possibly with the wrong sign. A<br />
small perturbation of the response values can lead to a large<br />
change in the effect estimates or even to opposite signs.<br />
c) Often confusing and misleading results.<br />
Collinearity Problems<br />
Detecting multicollinearity<br />
Calculate the variance inflation factor (VIF) for each predictor x_j:<br />
\[ \mathrm{VIF}_j = \frac{1}{1 - R_j^2} \]<br />
where R_j^2 is the coefficient of determination from regressing x_j on the other predictors.<br />
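As a rough illustration of how a VIF behaves, the sketch below uses the standard definition VIF_j = 1 / (1 - R_j^2) in the special case of exactly two predictors, where R_j^2 equals the squared Pearson correlation between them. The data and function names are made up for illustration; JMP computes VIFs for the full model.

```python
def pearson_r(a, b):
    """Pearson correlation between two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    sab = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    saa = sum((x - mean_a) ** 2 for x in a)
    sbb = sum((y - mean_b) ** 2 for y in b)
    return sab / (saa * sbb) ** 0.5


def vif_two_predictors(x1, x2):
    """VIF of either predictor when the model has exactly two predictors."""
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r * r)


# Orthogonal columns of a 2^2 factorial: no inflation
vif_orthogonal = vif_two_predictors([-1, -1, 1, 1], [-1, 1, -1, 1])
# Two nearly proportional (hypothetical) columns: strong inflation
vif_correlated = vif_two_predictors([1, 2, 3, 4], [1, 2, 3, 5])
print(vif_orthogonal, round(vif_correlated, 2))
```

An orthogonal design gives a VIF of 1, while strongly correlated columns push it well past the common rule-of-thumb threshold of 10.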
Collinearity Problems<br />
Detecting multicollinearity using JMP<br />
▼ Red triangle (next to<br />
"Model" in the upper-left<br />
panel of the data table) →<br />
Run Script<br />
Click the Run Model<br />
button<br />
Scroll to the Parameter<br />
Estimates section<br />
Right-click on the table<br />
→ Columns → VIF<br />
Collinearity Problems<br />
Detecting multicollinearity using JMP<br />
No VIF is larger than 10, so there is no severe multicollinearity in<br />
this design.<br />
Design Evaluations:<br />
The prediction variance at any factor setting<br />
in the design space is the product of the error<br />
variance and a quantity that depends on the<br />
design and the factor setting.<br />
This quantity, the ratio of the prediction variance to the error variance, is called the relative variance of<br />
prediction; it can be calculated before acquiring the<br />
data.<br />
It is ideal for the prediction variance to be small<br />
throughout the allowable region of the factors.
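In standard least-squares notation (the symbols X, f and x_0 are introduced here and are not from the slides), the statement above can be written as follows, where X is the model matrix of the design, f(x_0) is the vector of model terms evaluated at the factor setting x_0, and \sigma^2 is the error variance:

```latex
\operatorname{Var}\left[\hat{y}(x_0)\right] = \sigma^2 \, f(x_0)^{\top} \left(X^{\top} X\right)^{-1} f(x_0),
\qquad
\text{relative variance of prediction} = \frac{\operatorname{Var}\left[\hat{y}(x_0)\right]}{\sigma^2} = f(x_0)^{\top} \left(X^{\top} X\right)^{-1} f(x_0)
```

The relative variance involves only the design and the model, not the responses, which is why it is available before any data are collected.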
Design Evaluations:<br />
Prediction Variance Profile<br />
The prediction variance of 0.5 is<br />
relative to the error variance.<br />
For example, if the estimated<br />
(prior) variance of the experimental<br />
error (MSE) is 10, then the<br />
prediction variance of y at the<br />
center value of x_1 (= 0) is<br />
10 × 0.5 = 5.<br />
Control-click on a factor to set a factor level of your choice.<br />
You can drag the vertical trace<br />
lines to change the factor<br />
settings to different points.
Design Evaluations:<br />
Prediction Variance Profile<br />
The Maximize Desirability command on the Prediction Variance Profile title<br />
bar identifies the maximum (worst-case) prediction variance<br />
(1.321) for the model.<br />
Comparing the prediction variance profilers of two designs side by side<br />
is one way to compare the designs.
Design Evaluations:<br />
Fraction of Design Space<br />
The Fraction of Design Space (FDS)<br />
plot shows what fraction of the design space has a<br />
prediction variance at or below (or<br />
above) a given value.<br />
The X axis is the proportion or<br />
percentage of the design space, ranging<br />
from 0 to 100%, and the Y axis is the<br />
range of prediction variance values.<br />
Note: the 90th-percentile prediction<br />
variance is a useful summary in a<br />
variety of scenarios.<br />
Using the crosshair tool shows that 90%<br />
of the possible factor settings have a<br />
relative prediction variance less than<br />
0.91.
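The idea behind an FDS plot can be sketched in a few lines: evaluate the relative prediction variance at many points of the design space, sort the values, and read off the value below which a given fraction of the space lies. The example assumes a hypothetical four-run design at x = -1, -1, 1, 1 with the model y = b0 + b1*x, for which X'X = diag(4, 4) and the relative variance is (1 + x^2) / 4. It is not the design from the slides, and the names are illustrative.

```python
def relative_variance(x):
    """Relative prediction variance for the assumed 4-run, one-factor design."""
    return (1 + x * x) / 4


def fds_curve(n_points=2001):
    """Return (fractions, sorted variances) over an even grid on [-1, 1]."""
    xs = [-1 + 2 * i / (n_points - 1) for i in range(n_points)]
    variances = sorted(relative_variance(x) for x in xs)
    fractions = [(i + 1) / n_points for i in range(n_points)]
    return fractions, variances


def variance_at_fraction(fraction, n_points=2001):
    """Smallest variance v such that at least `fraction` of the grid is <= v."""
    fractions, variances = fds_curve(n_points)
    for f, v in zip(fractions, variances):
        if f >= fraction:
            return v
    return variances[-1]


q90 = variance_at_fraction(0.90)
print(round(q90, 4))
```

Plotting `variances` against `fractions` gives the FDS curve; a flatter, lower curve indicates a design that predicts well over more of the region.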
Design Evaluations:<br />
Design Diagnostics<br />
These efficiency measures are single<br />
numbers, each attempting to quantify one design<br />
characteristic. While the maximum efficiency<br />
is 100% for any criterion, an efficiency of 100%<br />
is impossible for many design problems.<br />
It is best to use these design measures to<br />
compare two competing designs with the<br />
same model and number of runs rather than<br />
as absolute measures of design quality.<br />
D: minimizes the volume of the joint confidence<br />
region for the regression coefficients.<br />
G: minimizes the maximum scaled<br />
prediction variance.<br />
A: minimizes the sum of the variances of<br />
all regression coefficients.
Why Augmented Design?<br />
Experimentation is an iterative process; we cannot assume<br />
that one successful screening experiment has optimized<br />
our process.<br />
Four common reasons for an unsatisfactory experiment:<br />
The specified model is inadequate.<br />
The results predicted from the experiment are not<br />
reproducible.<br />
Many trials failed.<br />
Important conditions, often an optimum, lie outside the<br />
experimental region.<br />
Motivation for Augmented Design<br />
Model Inadequacy<br />
Inadequacies of the model may be revealed during<br />
the analysis of the data.<br />
The investigated relationship may be more complicated<br />
than expected. Possible remedies and causes:<br />
Inclusion of higher-order polynomial terms<br />
Transformations of the factors or the response<br />
Detection of "lurking" variables<br />
Ranges of some factors are wrong*.<br />
Motivation for Augmented Design<br />
Failing Trials<br />
If many individual trials fail, there may not<br />
be sufficient data to estimate the<br />
parameters of the model.<br />
It is important to find out whether there was a<br />
technical mishap or whether<br />
something more fundamental is amiss.<br />
Motivation for Augmented Design<br />
Optimum Issues<br />
If the optimum of the response, or of some<br />
performance characteristic, appears to lie appreciably<br />
outside the present experimental region, experimental<br />
confirmation of this prediction will be necessary.<br />
Motivation for Augmented Design<br />
We now consider the augmentation of a design by the<br />
addition of a specified number of new trials.<br />
Reasons for augmentation include the need for a higher-order<br />
model, a different design region, the introduction of a<br />
new factor or the removal of unimportant factors.<br />
The new design will depend on the trials for which the<br />
response is known, although not usually on the values<br />
of the responses.<br />
Examples of Augmented Design<br />
Example: A chemical engineer investigates the effects of six<br />
factors on the percent reaction yield of a chemical process. A 2^(6-2)<br />
fractional factorial design (screening design) is implemented with<br />
2 additional center runs.<br />
The data show significant curvature for some factors,<br />
but collinearity prevents us from identifying the<br />
responsible factor(s).<br />
The engineers would also like to know whether it is beneficial to<br />
increase the amount of catalyst above the<br />
current concentration of 0.2 M to obtain a higher yield.<br />
With 3 selected factors and the catalyst concentration, an<br />
augmented design with 8 additional runs is used to fit the<br />
response surface model for the four factors.
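The run counts in this example follow from simple arithmetic, sketched below in Python. The variable and function names are hypothetical; the only inputs are the numbers stated in the example (six factors, a 2^(6-2) fraction, 2 center runs, 8 added runs, four factors in the response surface model).

```python
def fractional_factorial_runs(k: int, p: int) -> int:
    """Runs in a two-level 2^(k-p) fractional factorial design."""
    return 2 ** (k - p)


screening_runs = fractional_factorial_runs(6, 2) + 2   # 16 factorial runs + 2 center runs
total_runs = screening_runs + 8                         # after augmentation

# Full response surface model in four factors:
# intercept + 4 main effects + 6 two-factor interactions + 4 quadratic terms
rsm_terms = 1 + 4 + (4 * 3) // 2 + 4
spare_degrees_of_freedom = total_runs - rsm_terms
print(screening_runs, total_runs, rsm_terms, spare_degrees_of_freedom)
```

The augmented design therefore has more runs than model terms, leaving degrees of freedom for error and lack-of-fit assessment; a block term, if included, would use one of them.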
Augmented Design in JMP<br />
Open Augmented design 18 Runs.JMP in JMP.<br />
Right-click the red triangle next to Screening → Run Script.<br />
Augmented Design in JMP<br />
The screening analysis indicates that there is a<br />
curvature effect for X3, but it is completely<br />
aliased with the other quadratic effects,<br />
X1^2 and X4^2.
Augmented Design in JMP<br />
Right-click the red triangle next to Full Factorial Model with X1, X3 and<br />
X4 → Run Script.<br />
The lack-of-fit<br />
test indicates<br />
that the full<br />
factorial<br />
model is not<br />
adequate.<br />
Augmented Design in JMP<br />
Augmented design with a different model specification<br />
DOE → Augment Design<br />
Click OK.<br />
Augmented Design in JMP<br />
Augmentation Choices → Augment<br />
You can change the<br />
upper and lower<br />
limits for the new runs,<br />
which gives<br />
you a different<br />
prediction region.<br />
Note that catalyst now<br />
has an upper<br />
bound of 0.8.<br />
Usually the additional runs are performed on a different<br />
day, so we may treat the day as a blocking effect in the<br />
model and check this box.<br />
Augmented Design in JMP<br />
Specify RSM<br />
Again, the design is model<br />
dependent; specifying the<br />
correct model is crucial<br />
for a satisfactory<br />
experiment. In the<br />
augmented design,<br />
you would usually<br />
change some model<br />
terms, such as<br />
interactions and quadratic<br />
effects.<br />
Augmented Design in JMP<br />
Choose I-optimal as the<br />
optimality<br />
criterion.<br />
Set Number of Starts to<br />
1000.<br />
Set Random Seed to<br />
1000.<br />
Specify the total number<br />
of runs, including the original<br />
runs (18 + 8 = 26).<br />
Click Make Design.<br />
Click Make Table.<br />
Augmented Design in JMP<br />
New runs are<br />
grouped in<br />
block 2.<br />
Augmented Design in JMP<br />
○ Additional runs<br />
∆ Original design points<br />
For X1, X3 and X4, the<br />
original runs lie<br />
on corner and center points;<br />
the added runs lie<br />
on axial and face points.
Augmented Design in JMP<br />
Open Augmented design 26 Runs.JMP in JMP.<br />
Right-click the red triangle next to Model → Run Script.<br />
The lack-of-fit test is not significant<br />
for the RS model.<br />
From the effect tests, the X1^2 and X3^2<br />
quadratic effects are significant and the<br />
catalyst effect is significant at<br />
alpha = 0.1.<br />
Augmented Design in JMP<br />
Right-click the red triangle next to Reduced Model → Run Script.<br />
The lack-of-fit test is not significant<br />
for the RS model.<br />
All model effects are significant at<br />
alpha = 0.05.<br />
Augmented Design in JMP<br />
Right-click the red triangle next to Prediction Profile → Maximize Desirability.<br />
The predicted yield is maximized at X1 = 1, X3 = 1, X4 = -1 and<br />
Catalyst = 0.8, with a predicted value of 27.25. Confirm it with an<br />
additional run.<br />
Design from a Candidate Set<br />
Motivations for Designs Based on a Candidate Set<br />
What to consider when using Candidate Set<br />
Design<br />
Candidate Set Design in JMP<br />
What is a Candidate Set?<br />
A candidate set of design points is the total group of<br />
possible data points from which the actual design points can<br />
be chosen.<br />
For example, to construct quantitative structure-activity<br />
relationship (QSAR) models, which summarize a supposed<br />
relationship between the chemical structures and the biological activity<br />
of chemicals, a chemist may search a chemical compound<br />
database for candidate compounds.<br />
A QSAR has the form:<br />
Why Design Based on a Candidate Set<br />
We may not have full control of the experimental<br />
factors and may be limited to a choice of some factor<br />
combinations.<br />
The design space is complex, with irregular factor<br />
settings and complicated nonlinear factor<br />
constraints.<br />
As the term suggests, candidate set design helps us<br />
pick the best design points from the candidate set<br />
with respect to some criterion.<br />
Design from a Candidate Set<br />
Example, Mitchell (1974a): An animal scientist wants to compare wildlife densities<br />
in four different habitats over a year. However, due to the cost of experimentation,<br />
only 12 observations can be made. The following model is postulated for the<br />
density in a given habitat during a given month:<br />
<br />
This model includes the habitat as a classification variable, the effect of time<br />
as an overall linear drift term, and cyclic behavior in the form of a Fourier<br />
series. There is no intercept term in the model.<br />
Note: with 12 months and 4 habitats, we can create<br />
a candidate set of 48 points.
Design from a Candidate Set<br />
Open Mitchell.csv in JMP.<br />
The data set contains the 48 candidate points and includes the four cosine variables<br />
(c1, c2, c3, and c4) and three sine variables (s1, s2, and s3).
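A candidate set with this structure (4 habitats × 12 months, four cosine and three sine columns) could be generated along the following lines. This is only a sketch: the slides do not give the Fourier frequencies, so the code assumes one cycle per 12 months as the fundamental frequency, and it is not claimed to reproduce the contents of Mitchell.csv. The function name and the dictionary layout are hypothetical.

```python
from math import cos, pi, sin


def build_candidate_set(n_habitats=4, n_months=12):
    """One row per habitat-month combination, with Fourier columns c1-c4, s1-s3."""
    rows = []
    for habitat in range(1, n_habitats + 1):
        for month in range(1, n_months + 1):
            # ASSUMPTION: the fundamental period is one year (12 months)
            theta = 2 * pi * month / n_months
            row = {"Habitat": str(habitat), "Month": month}  # Habitat kept as a character value
            for k in range(1, 5):
                row[f"c{k}"] = cos(k * theta)
            for k in range(1, 4):
                row[f"s{k}"] = sin(k * theta)
            rows.append(row)
    return rows


candidates = build_candidate_set()
print(len(candidates), sorted(candidates[0]))
```

The 12-run design is then chosen from these 48 rows rather than from a continuous region.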
Design from a Candidate Set<br />
JMP will not randomize a candidate set design;<br />
do it manually!<br />
Because the design space is limited, we may end up with<br />
a design with severe collinearity.<br />
Look at the variance inflation factors (VIFs) to assess the<br />
lack of orthogonality.
Design from a Candidate Set<br />
Be sure to assess the quality of the design (e.g., FDS<br />
plots, statistical power, relative variance of coefficients).<br />
Check your final design space to diagnose possible<br />
problems.<br />
Remember that each design point in the candidate set can<br />
be selected only once.
Candidate Set Design in JMP<br />
Change the data type of some factors in the JMP file:<br />
Right-click on the Habitat column → Column<br />
Information → Data Type → Character → OK<br />
You need to specify the correct data type in this step because it<br />
cannot be changed in Custom Design.
Candidate Set Design in JMP<br />
DOE → Custom Design → Add Factor → Covariate<br />
You have to select one<br />
factor at a time in<br />
JMP 9.<br />
JMP will treat Habitat as categorical, following its data type in the<br />
candidate set file.
Candidate Set Design in JMP<br />
Model Specification
Candidate Set Design in JMP<br />
Custom Design ▼ → Optimality Criterion → Make D-Optimal Design<br />
Set the random seed to 193030034.<br />
Set the number of starts to 100.<br />
Specify 12 in Number<br />
of Runs.<br />
Click Make Design.<br />
Click Make Table.
Candidate Set Design in JMP<br />
Candidate set design and evaluations<br />
Non-randomized<br />
Note that for this design we cannot actually<br />
randomize the run order because of the nature of the Month factor. Let's<br />
assume it can be randomized for illustration purposes.
Candidate Set Design in JMP<br />
To randomize the run order and calculate VIFs manually<br />
in JMP, you need response data:<br />
Right-click on the Y column → Formula<br />
Scroll in the "Functions" box, choose Random →<br />
Random Uniform, then click the OK button<br />
Right-click on the Y column → Sort<br />
▼ Red triangle (next to "Model" in the upper-left panel of<br />
the data table) → Run Script<br />
Check No Intercept<br />
Click the Run Model button<br />
Scroll to the Parameter Estimates section<br />
Right-click on the table → Columns → VIF
Candidate Set Design in JMP<br />
Run the experiments<br />
according to this<br />
randomized order.<br />
VIFs
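The manual randomization described in the steps above (attach a uniform random number to each run, then sort by it) is a general technique that can be sketched outside JMP. The function name below is hypothetical, and the seed is optional and shown only so the example is repeatable.

```python
import random


def randomize_run_order(runs, seed=None):
    """Shuffle runs by attaching a uniform random key to each and sorting on it."""
    rng = random.Random(seed)
    keyed = [(rng.random(), run) for run in runs]
    keyed.sort(key=lambda pair: pair[0])
    return [run for _, run in keyed]


design_rows = list(range(1, 13))          # the 12 selected design points, by row number
run_order = randomize_run_order(design_rows, seed=2012)
print(run_order)
```

Every design point appears exactly once in the result; only the order in which the runs are performed changes.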
More capabilities of JMP<br />
1. Split-Plot/Split-Split-Plot Design<br />
2. Saturated and Supersaturated Design<br />
3. Mixture Design<br />
4. Choice Design Space<br />
5. Non-linear Design<br />
6. Space Filling Design / Design for Computer<br />
Experiments
Workshop<br />
Scenario 1: Suppose an experimenter is investigating the properties of a<br />
particular adhesive. x_1 is the amount of adhesive and x_2 is the cure temperature.<br />
The prior knowledge is:<br />
If too little adhesive is applied and the cure temperature is too low, the parts will not bond.<br />
<br />
If both factors are at high levels, the parts will either be damaged by heat stress<br />
or bond inadequately.<br />
<br />
The model of interest is a nonstandard model, i.e.,<br />
<br />
Also, due to the budget constraint, only 30 runs can be afforded.
Workshop<br />
Scenario 2: This example, from Meyer et al. (1996), demonstrates how to use the augment<br />
designer in JMP to resolve ambiguities left by a screening design. In this study,<br />
a chemical engineer investigates the effects of five factors on the percent<br />
reaction of a chemical process.<br />
To begin, open Reactor 8 Runs.jmp and augment this design with 8 additional<br />
runs to incorporate All Two-Factor Interactions.<br />
Set the random seed to 12834729.<br />
After you create the design, you can open Reactor Augment<br />
Data.jmp to do the variable selection using stepwise regression.<br />
Note: Choose P-value Threshold from the Stopping Rule menu and Mixed from<br />
the Direction menu, and make sure Prob to Enter is 0.050 and Prob to<br />
Leave is 0.100. These are not the default values.
Workshop<br />
Scenario 3: An automotive engineer wants to fit a quadratic (response<br />
surface) model to fuel consumption data in order to find the values of the<br />
control variables that minimize fuel consumption (refer to Vance 1986). The<br />
three control variables, AFR (air-fuel ratio), EGR (exhaust gas recirculation)<br />
and SA (spark advance), and their possible settings are shown in the following<br />
table:<br />
Variable | Values<br />
AFR | 15, 16, 17, 18<br />
EGR | 0.020, 0.177, 0.377, 0.566, 0.921, 1.117<br />
SA | 10, 16, 22, 28, 34, 40, 46, 52<br />
Rather than run all 192 (4 × 6 × 8) combinations of these factors (saved as<br />
candidate set workshop 3.csv), the engineer would like to see whether the<br />
total number of runs can be reduced to 50 in an optimal fashion.
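The 192-point candidate set is simply the full cross of the listed settings. As a sketch (the variable names and the list-of-dictionaries layout are assumptions, and the code does not read or write the workshop CSV file), it can be enumerated like this:

```python
from itertools import product

AFR = (15, 16, 17, 18)
EGR = (0.020, 0.177, 0.377, 0.566, 0.921, 1.117)
SA = (10, 16, 22, 28, 34, 40, 46, 52)

# Full cross of the allowed settings: one candidate point per combination
candidate_set = [
    {"AFR": afr, "EGR": egr, "SA": sa}
    for afr, egr, sa in product(AFR, EGR, SA)
]
print(len(candidate_set))
```

A 50-run design chosen from these points still comfortably exceeds the 10 terms of a full quadratic model in three factors.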
Design Evaluations:<br />
The signal-to-noise ratio (often abbreviated SNR or S/N) is a measure<br />
used in science and engineering to quantify how much a signal has been<br />
corrupted by noise.<br />
It is defined as the ratio of the signal power to the power of the noise corrupting the<br />
signal. A ratio higher than 1:1 indicates more signal than noise.<br />
In less technical terms, the signal-to-noise ratio compares the level of a<br />
desired signal (such as music) to the level of background noise. The<br />
higher the ratio, the less obtrusive the background noise is.<br />
In statistical analysis, the signal in the SNR is the regression<br />
coefficient of a model term, and the noise is the experimental error (model<br />
error) expressed as a standard deviation.
Design Evaluations:<br />
The Power column shows the<br />
power of the specified design to<br />
detect effects of a certain size<br />
(SNR) at a given significance level.<br />
Here, assume the model error standard<br />
deviation is σ = 2.5 and the true coefficient<br />
β_2 of X2 is 2.5; then the<br />
SNR = 2.5/2.5 = 1.0, and the probability<br />
(power) of detecting such an effect is<br />
0.402 at significance level 0.05.<br />
In JMP, you can change the SNR setting (e.g., Signal to<br />
Noise = 1, i.e., β_2/σ = 1) and the significance level (0.05).
Design Evaluations:<br />
Design: Power of the Design<br />
E.g., if the coefficient of<br />
Extraction Temp is large relative<br />
to the random noise, say β_2 is twice<br />
(β_2 = 5.0) as large as the noise, i.e.,<br />
SNR = β_2/σ = 2, we have more power<br />
to detect it (0.909 vs. 0.402).
Design Evaluations:<br />
As the significance level increases, the power increases.<br />
As the signal-to-noise ratio increases, the power increases.<br />
Note: If your design turns out to have very low power<br />
even with large signal-to-noise ratio settings, then you<br />
need to question whether it is worth running the<br />
experiment!