Exploring Best Practices in the Design of Experiments - JMP
Exploring Best Practices in the<br />
Design of Experiments<br />
Chris Nachtsheim<br />
University of Minnesota<br />
Carlson School of Management<br />
and<br />
School of Statistics<br />
4/29/2013
Joint work with Brad Jones<br />
SAS/JMP<br />
Overview<br />
• Where have we been?<br />
• Where are we now?<br />
• Emerging paradigm: problem-driven design<br />
• Five Examples<br />
• What’s driving the software?<br />
• Definitive screening and an example<br />
• The future?<br />
Where have we been?<br />
• 10th century: Rhazes<br />
• Hospital director in Baghdad<br />
• First clinical trial: efficacy of bloodletting on meningitis<br />
• Avicenna: Eleventh century<br />
• Seven rules for medical experimentation, including:<br />
• Vary one factor at a time<br />
• Need for controls and replication<br />
• Use of multiple levels of a treatment<br />
• James Lind, 1753: “A Treatise on Scurvy”<br />
• First (published) one-way layout<br />
From “A Treatise…”<br />
“On the 20th May, 1747, I took twelve patients in the scurvy on board the<br />
Salisbury at sea. Their cases were as similar as I could have them. They all in<br />
general had putrid gums, the spots and lassitude, with weakness of their knees.<br />
…<br />
•Two of these were ordered each a quart of cyder a day…<br />
•Two others took twenty five gutts of elixir vitriol three times a day upon an<br />
empty stomach…<br />
•Two others took two spoonfuls of vinegar three times a day upon an empty<br />
stomach…<br />
•Two of the worst patients, with the tendons in the ham rigid (a symptom none<br />
the rest had) were put under a course of sea water…<br />
•Two others had each two oranges and one lemon given them every day…<br />
•The two remaining patients took the bigness of a nutmeg three times a day…<br />
The consequence was that the most sudden and visible good effects were<br />
perceived from the use of the oranges and lemons”<br />
• Gergonne: 1815<br />
• Designs for polynomial regression, response surface design<br />
• C. S. Peirce: 1870s<br />
• Randomization<br />
• K. Smith, 1918: Biometrika, optimal design for polynomial regression<br />
R. A. Fisher put it all together<br />
Fisher, 1920s:<br />
• Randomization as mathematical basis for analysis<br />
• Local control and blocking<br />
• Replication<br />
• Factorial designs<br />
• Split plot designs<br />
• Confounding<br />
• ANOVA<br />
• F, t distributions<br />
• Etc., etc.<br />
1920s to 1950s: Orthogonality is the driving principle<br />
• Fisher, Yates: need for ease of computation, independence of effects<br />
• R. C. Bose, C. R. Rao, and the Indian school: combinatorics, BIBDs, PBIBDs<br />
• Finney, 1945: Fractional replication<br />
• Plackett and Burman, 1946<br />
…culminating in the 2^(k-p) system<br />
1950s: Baby steps away from orthogonality<br />
Also Box and Lucas, 1959, nonlinear design<br />
Optimal Design: Pioneers<br />
• Kiefer and Wolfowitz (1959)<br />
• Fedorov (1969)<br />
• Wynn (1969)<br />
Shift to problem-driven design<br />
(1970s, 80s, 90s, 2000s)<br />
• Toby Mitchell: DETMAX<br />
• Constrained mixture designs: John Cornell, Ron Snee, Greg Piepel<br />
• Design augmentation<br />
• Optimal blocking (Cook, Nachtsheim ’87; Donev, Atkinson ’87)<br />
• Model robust design (Lauter, 1974)<br />
• Bayesian design (Chaloner, ’80; DuMouchel and Jones, ’94)<br />
• Optimal split-plot designs (Goos, 2002)<br />
Where are we now?<br />
Paradigms for DOE<br />
Classical:<br />
• Fisher paradigm<br />
• Textbook paradigm<br />
• duPont paradigm<br />
Optimal:<br />
• Problem-driven paradigm<br />
The Classical Approach to DOE<br />
1. State the objectives and your constraints<br />
2. Find a design from a book, a table, or a software program that most closely matches your objectives (requires DOE expertise)<br />
3. Implement the design<br />
4. Analyze results<br />
Classical design paradigms pros, cons<br />
Pros<br />
1. Designs are tried & true<br />
2. Easy analysis<br />
Cons<br />
1. Requires much expertise<br />
2. Rigidity: designs often do not match the problem<br />
What DOE Expertise is Needed?<br />
• Full factorial designs<br />
• Interactions<br />
• 2^k factorial designs<br />
• 2^(k-p) fractional factorial designs<br />
• Confounding schemes and aliasing<br />
• Resolution<br />
• Screening designs (Plackett-Burman, Resolution III fractional factorials)<br />
OK, what else?<br />
What DOE Expertise (continued)?<br />
• Response surface experiments<br />
• Blocked experiments<br />
• Nested designs<br />
• Repeated measures and split plot designs<br />
• Incomplete block designs<br />
• Mixture experiments and mixture models<br />
• Supersaturated designs<br />
When might classical designs fail to match the problem?<br />
• Multi-level designs: some factors at two levels, some at three, some at four<br />
• Constrained designs: can’t run the factors simultaneously at the high levels, or the chemical process will blow up the plant!<br />
• Some factors are qualitative, some not<br />
• I have a mixture experiment: factor levels are proportions that must add to one<br />
• My block sizes cannot be the same.<br />
• I want to augment an existing design.<br />
• Etc., etc., etc.<br />
New Paradigm: Problem-driven design<br />
Problem-Driven Paradigm<br />
• With good software, the approach is easy to use and to teach.<br />
• One, unified approach to (nearly) all design problems:<br />
1. Describe your responses<br />
2. Describe your factors<br />
3. Describe your objectives (model terms)<br />
4. Describe your constraints and budget (n)<br />
5. Press button to create the design<br />
6. Check diagnostics, do sensitivity analysis<br />
7. Manually alter the design, as with classical designs<br />
Problem-driven paradigm pros, cons:<br />
Pros<br />
1. Design tailored to problem<br />
2. One unified approach to nearly all problems<br />
Cons<br />
1. No textbook advocating computer-aided design<br />
2. Technical concerns from the stats community<br />
Five Illustrations with JMP<br />
Example 1: Extracting food solids from peanuts in solution, n = 18<br />
# | Factor | Low | High<br />
1 | Water pH level | 6.95 | 8.0<br />
2 | Water temp | 20 °C | 60 °C<br />
3 | Extraction time | 15 | 40<br />
4 | Water-peanuts ratio | 5 | 9<br />
5 | Agitation speed | 5,000 | 10,000<br />
6 | Hydrolyzed? | N | Y<br />
7 | Presoaking? | N | Y<br />
Classical Approach: 2^(7-3) Resolution IV FF<br />
Problem:<br />
•Seven two-level factors in 16 runs.<br />
•Main effects are critical<br />
•Partial information about two-factor interactions is desirable.<br />
•Standard approach: Resolution IV fractional factorial design<br />
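To make the classical route concrete, here is a minimal sketch that builds one 16-run 2^(7-3) fraction in coded units and checks what "Resolution IV" buys. The generators E = ABC, F = BCD, G = ACD are an assumption (a common textbook choice, not taken from the slides), and the function and variable names are illustrative only.

```python
import itertools
import numpy as np

def fractional_factorial_2_7_3():
    """16-run 2^(7-3) design; generators E=ABC, F=BCD, G=ACD are assumed."""
    base = np.array(list(itertools.product([-1, 1], repeat=4)))  # full 2^4 in A, B, C, D
    A, B, C, D = base.T
    E, F, G = A * B * C, B * C * D, A * C * D
    return np.column_stack([A, B, C, D, E, F, G])

design = fractional_factorial_2_7_3()
pairs = list(itertools.combinations(range(7), 2))
two_fi = np.column_stack([design[:, i] * design[:, j] for i, j in pairs])

# Resolution IV: every main effect is orthogonal to every two-factor interaction ...
main_vs_2fi = design.T @ two_fi
# ... but some two-factor interactions are completely aliased with each other.
fi_vs_fi = two_fi.T @ two_fi
upper = fi_vs_fi[np.triu_indices(len(pairs), k=1)]
n_aliased_pairs = int((np.abs(upper) == 16).sum())

print("any main effect aliased with a 2FI:", bool(main_vs_2fi.any()))
print("pairs of fully aliased 2FIs:", n_aliased_pairs)
```

This is the "partial information about two-factor interactions" trade-off: main effects are clean, but an active interaction can only be attributed to an alias group, not to a single pair of factors.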
Knowledge Requirements:<br />
• Main effects, interactions (model terms)<br />
• Confounding schemes, aliasing<br />
• Resolution<br />
• Fractional factorials<br />
• Plackett-Burman designs<br />
• Blocking (maybe)<br />
Contrast: Custom Design Approach<br />
Knowledge Requirements:<br />
1. Main effects, interactions (model terms)<br />
2. Estimability level of model terms: either “necessary” or “if possible”<br />
Here’s how it’s done:<br />
JMP Demo:<br />
Example 2: Maximize recall of TV commercial<br />
Factors:<br />
A: Length of commercial (10 to 30 seconds)<br />
B: Number of repetitions (1 to 3)<br />
C: Amount of time since last viewing (1 to 5 days)<br />
Classical Design Knowledge Requirements:<br />
• Main effects, interactions, second-order powers (model terms)<br />
• Confounding and resolution (for corners)<br />
• Fractional factorials (for corners)<br />
• CCDs, Box-Behnken designs<br />
• Blocking (sort of)<br />
• Axial points<br />
Contrast: Problem-Driven Design Approach<br />
Knowledge Requirements:<br />
1. Main effects, interactions, quadratic terms (RSM button)<br />
JMP Demo:<br />
Example 3: Supersaturated Designs<br />
Problem:<br />
• 20 two-level factors (yikes)<br />
• Block size = 5 (how odd)<br />
• n = 15<br />
What?????<br />
Experiment with fewer runs than factors????<br />
BAH…HAHAHAHAHAHAHHAHAHAHAHAHA<br />
You can’t do that!<br />
1. Design matrix is singular so multiple regression fails.<br />
2. Factor aliasing is complex.<br />
3. “You can’t get something for nothing.”<br />
A more general definition…<br />
• Supersaturated designs have fewer runs than parameters of interest.<br />
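The first objection ("design matrix is singular") is easy to see numerically. The sketch below uses the sizes from Example 3 (15 runs, 20 two-level factors); the random ±1 settings are only a placeholder, not a properly constructed supersaturated design, and the seed is arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)           # arbitrary seed, for reproducibility only
n_runs, n_factors = 15, 20               # sizes from Example 3

# Model matrix: intercept column plus one column per main effect
X = np.column_stack([np.ones(n_runs),
                     rng.choice([-1.0, 1.0], size=(n_runs, n_factors))])

n_params = X.shape[1]                    # 1 + 20 = 21 parameters of interest
rank = np.linalg.matrix_rank(X)

# The rank cannot exceed the number of runs, so X'X (21 x 21) is singular and
# ordinary least squares on the full main-effects model has no unique solution.
print(f"runs={n_runs}, parameters={n_params}, rank={rank}")
```

This is why the analysis of such designs leans on effect sparsity and variable selection (stepwise or similar) rather than on fitting all terms at once.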
FF designs are supersaturated designs!<br />
1. Adding center points to two-level factorial designs.<br />
Others:<br />
1. Resolution III: Supersaturated with respect to the model containing all the two-factor interaction effects.<br />
2. Resolution IV: Supersaturated with respect to the model containing all the two-factor interaction effects.<br />
3. Resolution V: Supersaturated with respect to models containing any three-factor or higher order interaction.<br />
Problem-Driven Design Approach<br />
Knowledge Requirements:<br />
• Estimability levels<br />
• Use of stepwise (or other approaches) for analysis of supersaturated designs<br />
[JMP screenshot: 20 factors plus one blocking factor]<br />
Not all main effects are “necessary”. We will make them all “if possible” here.<br />
Result: Highly efficient supersaturated design<br />
Jones, Lin, Nachtsheim, “Bayesian Supersaturated Designs” (JSPI, 2006)<br />
Color-coded correlation matrix has the blues!<br />
JMP Demo: The Famous Supersaturated Card Trick<br />
Example 4:<br />
Sheet Aluminum Rolling Mill<br />
[Process diagram; labels: scrap aluminum, furnace, molten aluminum, oil-water coolant, finish temperature, Mill 1 depth … Mill 3 depth, quality (finish) rating, to coiling/shipping]<br />
Textbook approach to Aluminum Milling<br />
# | Factor | Low | High<br />
1 | Mill depth 1 | 0.1 | 0.8<br />
2 | Mill depth 2 | 0.1 | 0.8<br />
3 | Mill depth 3 | 0.1 | 0.8<br />
4 | Spray volume | L | H<br />
5 | Spray oil content | L | H<br />
But there is a problem:<br />
We have constrained mill depths (MDs): MD1 + MD2 + MD3 = 1.0<br />
None of the tabled designs work!<br />
JMP Demo: All mill depths will sum to 1.0.<br />
DOE to the rescue!<br />
• Result: Increasing mill depth 3 improved the quality of the surface<br />
Scrap reduced from 50% to 10%;<br />
Company saved!<br />
Split Plots:<br />
DOE with Easy and Hard to Change Factors<br />
Very common in industrial experiments!<br />
Can trip up even professional statisticians<br />
“All industrial experiments are split-plot experiments”<br />
--Cuthbert Daniel<br />
Confession of a Textbook Author<br />
• I have only recently appreciated how effective split plot experiments really are.<br />
• Thorough treatment should be standard in any introductory design course, any text, any statistics package.<br />
• They merited a whopping three pages in my textbook!<br />
Recent Paper: My Conversion<br />
So What is a Split-Plot Design and Why the Name?<br />
Invented by Fisher for agricultural applications<br />
Example 5: Corrosion resistance of steel bars<br />
(Box, Hunter, and Hunter textbook)<br />
DOE with two factors:<br />
1. Furnace (curing) temperatures (360 °C, 370 °C, 380 °C)<br />
2. Coating type (C1, C2, C3, C4)<br />
Easy-to-change and hard-to-change factors<br />
Problem: Not all factors are created equal!<br />
Furnace temperature: Hard to change!<br />
Coating type: Easy to change!<br />
Actual Corrosion Resistance DOE<br />
Furnace run (whole plot, H2C) | Temperature (°C) | Coatings in randomized order (split plots, E2C)<br />
1 | 360 | C2 C1 C3 C4<br />
2 | 380 | C1 C3 C4 C2<br />
3 | 370 | C3 C1 C2 C4<br />
4 | 380 | C4 C3 C2 C1<br />
5 | 370 | C4 C1 C3 C2<br />
6 | 360 | C1 C4 C2 C3<br />
Advantage: n = 24 but only 6 furnace runs!<br />
Key Split Plot DOE Characteristic<br />
Two randomizations:<br />
Step 1: Randomly assign temps to furnace runs<br />
Step 2: For each furnace run, randomly assign coatings to four subgroups of bars within the furnace run<br />
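The two randomizations can be sketched in a few lines. This is only an illustration under assumptions: the function name `split_plot_plan`, the `replicates` argument, and the seed are hypothetical, and the output will not reproduce the exact run order shown in the slides.

```python
import random

def split_plot_plan(temps, coatings, replicates=2, seed=None):
    """Return a list of (furnace_run, temperature, coating_order) tuples."""
    rng = random.Random(seed)

    # Step 1 (whole plots): randomly assign temperatures to furnace runs
    furnace_temps = list(temps) * replicates
    rng.shuffle(furnace_temps)

    plan = []
    for run, temp in enumerate(furnace_temps, start=1):
        # Step 2 (split plots): a fresh random coating order within each furnace run
        order = list(coatings)
        rng.shuffle(order)
        plan.append((run, temp, order))
    return plan

plan = split_plot_plan([360, 370, 380], ["C1", "C2", "C3", "C4"], seed=2013)
for run, temp, order in plan:
    print(run, temp, " ".join(order))
```

The hard-to-change factor is reset only six times, while every furnace run still contains all four coatings; the analysis then needs separate error terms for the two levels of randomization.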
Contrast with ordinary DOE<br />
In usual DOE: just one randomization:<br />
3x4 factorial with an added replicate requires 24 furnace runs<br />
(Only one coating type per furnace run)<br />
All-Too-Common Mistake<br />
Step 1. Create ordinary randomized DOE<br />
Step 2. Sort the runs by the hard-to-change factor and run in that order (DOE is actually now a split plot design)<br />
Step 3. Analyze as though a randomized DOE<br />
Analysis depends on the randomization<br />
Correct analysis:<br />
Factor | p-value | Significant? | Verdict<br />
Temperature | 0.209 | No | Right!<br />
Coatings | 0.002 | Yes | Right!<br />
Temp x Coatings | 0.024 | Yes | Right!<br />
Usual analysis:<br />
Factor | p-value | Significant? | Verdict<br />
Temperature | 0.003 | Yes | Wrong!<br />
Coatings | 0.386 | No | Wrong!<br />
Temp x Coatings | 0.852 | No | Wrong!<br />
JMP Demo<br />
Take-Aways<br />
1. Always consider running industrial designs as split plots<br />
2. Lose info on whole-plot main effects, but…<br />
3. Gain info on split-plot factors and split-plot by whole-plot factor interactions<br />
4. Overall efficiency can be better than CRD<br />
5. Teach these methods in workshops on DOE<br />
What’s driving the software?<br />
Optimal design construction, optimal blocking<br />
• Coordinate exchange algorithm, Meyer and Nachtsheim, Technometrics, 1995<br />
• Optimal blocking, Cook and Nachtsheim, Technometrics, 1987<br />
Bayesian D-Optimal Designs<br />
• DuMouchel and Jones, Technometrics, 1994<br />
• Jones, Lin, Nachtsheim, Journal of Statistical Planning and Inference, 2006<br />
Optimal Split-Plot and Split-Split Plot Designs<br />
• Goos and Jones, Biometrika, 2009<br />
Optimal designs depend on a model!<br />
OH YEAH?<br />
1. No more so than classical designs.<br />
2. They should!<br />
3. If you’re going to lose sleep, see the model-robustness, Bayesian, and multi-objective design literature<br />
Lots of work in Model Robust and Multi-objective Design<br />
Randomly chosen examples<br />
1. Li/Nachtsheim, 2000, Model Robust Factorial Designs<br />
2. DuMouchel/Jones, 1994, Bayesian D-optimal Designs<br />
3. Jones/Nachtsheim, 2011, Minimal Aliasing Designs<br />
Definitive Screening Designs<br />
Multi-objective designs: six-factor example, n = 12<br />
• Standard choice: Plackett-Burman design<br />
• Plackett-Burman designs have “complex aliasing” of the main effects by two-factor interactions.<br />
Alias Matrix for PB design<br />
Full Model: 1 + 6 + 15 = 22 terms<br />
Model we can fit:<br />
Leads to bias:<br />
Alias matrix:<br />
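The three formulas on this slide did not survive extraction. The standard textbook forms are given below as a hedged reconstruction; the symbols $X_1$ (intercept and six main-effect columns), $X_2$ (the 15 two-factor interaction columns), $\beta_1$ and $\beta_2$ are assumed notation, not necessarily the slide's.

```latex
\begin{aligned}
\text{Full model:}\quad        & y = X_1\beta_1 + X_2\beta_2 + \varepsilon \\
\text{Model we can fit:}\quad  & y = X_1\beta_1 + \varepsilon \\
\text{Leads to bias:}\quad     & E[\hat\beta_1] = \beta_1 + A\,\beta_2 \\
\text{Alias matrix:}\quad      & A = (X_1^{\top}X_1)^{-1}X_1^{\top}X_2
\end{aligned}
```

With $n = 12$ runs, $X_1$ is $12 \times 7$ and $X_2$ is $12 \times 15$, so $A$ is $7 \times 15$: one row per fitted term, one column per omitted interaction.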
Alias Matrix of Plackett-Burman design<br />
PB (non-regular) design has “complex aliasing”<br />
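The alias-matrix figure for this slide is missing, but it can be recomputed. The sketch below builds a 12-run Plackett-Burman design from the usual cyclic generator row, keeps six of its columns (which six is an arbitrary assumption here), and evaluates the alias matrix of the main-effects model against all 15 two-factor interactions.

```python
import itertools
import numpy as np

# 12-run Plackett-Burman design: 11 cyclic shifts of the generator row + a row of -1
gen = np.array([1, 1, -1, 1, 1, 1, -1, -1, -1, 1, -1])
pb12 = np.vstack([np.roll(gen, s) for s in range(11)] + [-np.ones(11, dtype=int)])
factors = pb12[:, :6]                      # first six columns (arbitrary choice)

X1 = np.column_stack([np.ones(12), factors])                  # intercept + main effects
pairs = list(itertools.combinations(range(6), 2))
X2 = np.column_stack([factors[:, i] * factors[:, j] for i, j in pairs])

# Alias matrix A = (X1'X1)^-1 X1'X2, rows = fitted terms, columns = omitted 2FIs
alias = np.linalg.solve(X1.T @ X1, X1.T @ X2)
print(np.round(alias, 3))
```

Each main effect is partially aliased (entries of magnitude 1/3) with the interactions that do not involve it, which is the "complex aliasing" the slide refers to.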
If only there were another design with this alias matrix:<br />
Effect A*B A*C A*D A*E A*F B*C B*D B*E B*F C*D C*E C*F D*E D*F E*F<br />
Intercept 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0<br />
A 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0<br />
B 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0<br />
C 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0<br />
D 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0<br />
E 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0<br />
F 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0<br />
Turns out there is:<br />
Definitive Screening Design for 6 factors<br />
• Six foldover pairs (runs 1-2, 3-4, …, 11-12)<br />
• Center level (0) for one factor in each of runs 1 to 12<br />
• One overall center point (run 13)<br />
Run A B C D E F<br />
1 0 1 -1 -1 -1 -1<br />
2 0 -1 1 1 1 1<br />
3 1 0 -1 1 1 -1<br />
4 -1 0 1 -1 -1 1<br />
5 -1 -1 0 1 -1 -1<br />
6 1 1 0 -1 1 1<br />
7 -1 1 1 0 1 -1<br />
8 1 -1 -1 0 -1 1<br />
9 1 -1 1 -1 0 -1<br />
10 -1 1 -1 1 0 1<br />
11 1 1 1 1 -1 0<br />
12 -1 -1 -1 -1 1 0<br />
13 0 0 0 0 0 0<br />
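The all-zero alias matrix can be checked directly from the 13-run table. The sketch below is a verification under the same main-effects-versus-two-factor-interactions setup as the Plackett-Burman slides; variable names are illustrative.

```python
import itertools
import numpy as np

# The 13-run, 6-factor design exactly as tabulated (columns A to F)
dsd = np.array([
    [ 0,  1, -1, -1, -1, -1],
    [ 0, -1,  1,  1,  1,  1],
    [ 1,  0, -1,  1,  1, -1],
    [-1,  0,  1, -1, -1,  1],
    [-1, -1,  0,  1, -1, -1],
    [ 1,  1,  0, -1,  1,  1],
    [-1,  1,  1,  0,  1, -1],
    [ 1, -1, -1,  0, -1,  1],
    [ 1, -1,  1, -1,  0, -1],
    [-1,  1, -1,  1,  0,  1],
    [ 1,  1,  1,  1, -1,  0],
    [-1, -1, -1, -1,  1,  0],
    [ 0,  0,  0,  0,  0,  0],
])

X1 = np.column_stack([np.ones(len(dsd)), dsd])               # intercept + main effects
pairs = list(itertools.combinations(range(6), 2))
X2 = np.column_stack([dsd[:, i] * dsd[:, j] for i, j in pairs])

alias = np.linalg.solve(X1.T @ X1, X1.T @ X2)                # (X1'X1)^-1 X1'X2
print("largest |alias entry|:", np.abs(alias).max())
```

The foldover structure is what forces every main-effect-by-interaction cross product to cancel, so active two-factor interactions cannot bias the main-effect estimates.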
How did we find this design?*<br />
*Jones, Nachtsheim, Technometrics, 2011<br />
Now generalize this structure for m factors<br />
Can we find great designs for any m?<br />
It turns out there is a “Conference Matrix” solution<br />
An m x m square matrix C with 0 diagonal and +1 or -1 off-diagonal elements such that C^T C = (m - 1) I<br />
•Arose in connection with a problem in telephony; first described and named by Vitold Belevitch<br />
•Belevitch was interested in constructing ideal telephone conference networks from ideal transformers and discovered that such networks were represented by C.<br />
Conference Matrices & Telephony<br />
[Figure from the Wikipedia article on conference matrices]<br />
Conference Matrix of Order 6<br />
C =<br />
Here is the amazing result:<br />
Form the augmented matrix by stacking C, -C, and a row of zeros: D = [C; -C; 0]<br />
…and you get an orthogonal (for main effects) definitive screening design!<br />
Design Properties<br />
1. The number of required runs is only one more than twice the number of factors.<br />
2. Unlike Plackett-Burman and Resolution III fractional factorial designs, main effects are completely independent of two-factor interactions; unlike Resolution IV designs, two-factor interactions are not completely confounded with one another.<br />
3. Unlike all predecessors, these are three-level designs, so we can estimate curvatures!<br />
4. Designs are capable of estimating all possible full quadratic models involving three or fewer factors with very high levels of statistical efficiency.<br />
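The construction and properties 1 and 4 can be checked numerically for m = 6. In the sketch below the conference matrix is read off the odd-numbered runs of the 13-run table earlier in the talk (an assumption about which C the slide shows); the helper `quadratic_model_matrix` is a hypothetical name. Rows come out in the order C, -C, 0 rather than interleaved as foldover pairs, which does not change the design.

```python
import itertools
import numpy as np

# Order-6 conference matrix (odd-numbered runs of the 13-run table)
C = np.array([
    [ 0,  1, -1, -1, -1, -1],
    [ 1,  0, -1,  1,  1, -1],
    [-1, -1,  0,  1, -1, -1],
    [-1,  1,  1,  0,  1, -1],
    [ 1, -1,  1, -1,  0, -1],
    [ 1,  1,  1,  1, -1,  0],
])
m = C.shape[0]
assert np.array_equal(C.T @ C, (m - 1) * np.eye(m, dtype=int))  # conference property

# Augmented matrix: C, its foldover -C, and one overall center run
D = np.vstack([C, -C, np.zeros((1, m), dtype=int)])

def quadratic_model_matrix(X):
    """Intercept, main effects, two-factor interactions and squares of X's columns."""
    k = X.shape[1]
    cols = [np.ones(len(X))] + [X[:, i] for i in range(k)]
    cols += [X[:, i] * X[:, j] for i, j in itertools.combinations(range(k), 2)]
    cols += [X[:, i] ** 2 for i in range(k)]
    return np.column_stack(cols)

# Property 4: project onto every set of three factors and check that the
# 10-term full quadratic model is estimable (model matrix has full column rank).
ranks = [np.linalg.matrix_rank(quadratic_model_matrix(D[:, list(t)]))
         for t in itertools.combinations(range(m), 3)]
print("runs:", D.shape[0], "ranks over all 3-factor projections:", set(ranks))
```

A rank of 10 for every triple means that, if at most three factors turn out to be active, the same 13 runs already support a response-surface fit in those factors.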
Impact? A breakthrough solution for sequestering greenhouse gases<br />
Impact? First published DSD case study<br />
Impact? From the conclusions:<br />
“Definitive-screening designs were used to efficiently<br />
select a model describing the formulation of a protein<br />
under clinical development. The ability of the single<br />
definitive screening design to identify and model all the<br />
active effects obviated the need for further<br />
experimentation, reducing the total number of<br />
experimental runs required to 17 from the greater than<br />
or equal to 70 runs that would have been necessary<br />
using the traditional screening/RSM approach.”<br />
Awards for Definitive Screening Designs<br />
“A New Class of Three-Level Screening Designs for<br />
Definitive Screening in the Presence of Second-Order<br />
Effects”, Journal of Quality Technology, Jan., 2011<br />
•Brumbaugh Award, 2011<br />
•Lloyd S. Nelson Award, 2012<br />
•(Led to) Statistics in Chemistry Award, 2012<br />
•Much work now in combinatorial aspects<br />
Upshot – Definitive Screening Designs<br />
1. Our view: engineers, scientists prefer three levels.<br />
2. Can estimate curvatures<br />
3. Can disentangle interactions<br />
4. Designs project to fully estimable response surface designs in two or three factors, so with effect sparsity, two experiments for the price of one.<br />
5. We see little or no reason to continue the practice of using 2^(k-p) designs for four or more continuous factors!!<br />
BUT!…a limitation<br />
DSDs can only be used with continuous (quantitative) factors<br />
DANG!!!!<br />
Not any more:<br />
• Journal of Quality Technology, to appear, April 2013<br />
Example 6: Extracting food solids from peanuts in solution, again, n = 18<br />
# | Factor | Low | High | Type<br />
1 | Water pH level | 6.95 | 8.0 | continuous<br />
2 | Water temp | 20 °C | 60 °C | continuous<br />
3 | Extraction time | 15 | 40 | continuous<br />
4 | Water-peanuts ratio | 5 | 9 | continuous<br />
5 | Agitation speed | 5,000 | 10,000 | continuous<br />
6 | Hydrolyzed? | N | Y | categorical<br />
7 | Presoaking? | N | Y | categorical<br />
JMP Demo<br />
Recall our conclusion: there are three active interaction effects, but we cannot identify them!<br />
Using the same number of runs (18), the definitive screening design will not only identify the same main effects, but it will:<br />
• Investigate all curvatures (concluding there are none)<br />
• Investigate all interactions and definitively identify the three active interactions<br />
So, what are the current best practices?<br />
Current best practices are:<br />
• Problem-driven<br />
• Three-level, where possible<br />
• Multi-objective<br />
• Supersaturated<br />