Introduction to Design and Analysis of Experiments: - LISA

Introduction to Design and Analysis of Experiments: a LISA short course

Jonathan Stallings 

June 17, 2013

My Qualifications 

 

 

 

 

4th-year PhD Statistics Student

BS in Mathematics, UMW 

MS in Statistics, VT 

Main research interest is Experimental Design 

I help researchers answer their specific 

questions by giving them the best possible 

“tools” to collect and analyze their data.

A little about LISA 

 

 

Laboratory for Interdisciplinary Statistical Analysis 

Free Collaboration: 

– Experimental Design – Data Analysis – Software Help 

 

 

Interpreting Results – Grant Proposals 

Free Walk-In Consulting for quick statistics questions 

Free Short Courses 

– R tutorial; Structural Equation Modeling; Plotting Data 

Improve research quality through 

project collaboration and 

statistical consulting

Requesting a LISA Meeting 

 

 

 

 

 

To request a collaboration go to: www.lisa.stat.vt.edu 

Sign in using VT PID and password 

Enter your information (e-mail, college, etc.) 

Describe your project (project title, research goals, 

specific research questions, do you have data, etc.) 

Contact assigned LISA collaborators as soon as possible to 

schedule meeting 

We prefer to meet with you before 

data collection!

Short Course Goals 

 

 

 

 

Discuss difference between designed experiment and 

observational study 

Introduce terminology used by statisticians to make 

collaboration easier 

Detail fundamentals of a good design 

Explain how to analyze data in JMP to answer research 

questions (conclusions and interpretations)

What we won't be talking about 

 

 

 

 

 

Designing Surveys 

Sample size calculations 

Assumption Checking 

Measurement Error 

Sequential Design (Response Surface methodology) 

These are important design questions, but some involve more 

advanced design techniques and statistical knowledge. LISA 

collaboration meetings are ideal for these questions.

Sources of Variation

Sources of Variation: Example 

 

 

Flip a quarter: Heads 

Flip a nickel: Tails 

Wikipedia.org 

Marshu.com 

 

Why did we get two different responses? 

We may argue that if we identify all aspects that went into a 

given coin toss and can perfectly replicate it, we will get the 

exact same result.

Sources of Variation 

 

 

A source of variation is anything that could cause an 

observation to be different from another observation 

Characteristics 

– Degree of variability 

– Consistency of effect 

– Can it be controlled? 

 

The goal is to identify few sources of variation that 

explain the majority of the variability.

Explaining Variance VS 

Sources of Variation 

 

Large cities have recorded simultaneous increases in 

monthly murder rates and ice cream sales. 

Credit: Taylor Dewey, using public images 

 

In this case, ice cream sales help “explain variance” 

of murder rates, but aren't a source of variation

Two types of Major Sources of 

Variation 

 

Those that can be controlled and are of interest are 

called treatments or treatment factors 

• Drug in medical experiment 

• Settings on machine producing tires 

• Different types of political advertising to encourage voting 

 

Those that are not of interest but are difficult to control 

are nuisance factors 

• Sex 

• Age 

• Weather

Dealing with Sources of Variation 

 

 

 

 

The primary goal of an experiment is to determine the 

amount of variation caused by the treatment factors in 

the presence of other sources of variation. 

Want the majority of the variability of the data to come 

from the treatment factors 

A good design will minimize the impact of minor 

sources of variation, while taking into account 

variability caused by nuisance factors 

Let's go back to the coin example...

Sources of Variation: Example 

 

 

 

 

Response: Probability of flipping heads 

Potential treatment factors: 

• Type of coin 

• Heads/Tails up before flip 

Nuisance factors: 

• Person flipping coin 

Minor sources of variability: 

• Environmental factors (e.g. wind)

Sources of Variation: Summary 

 

 

 

 

 

A source of variation is anything that causes an 

observation to be different from another observation 

List potential major and minor sources of variation 

before collecting data 

A treatment factor can be controlled 

Minimize the impact of minor sources of variation, and 

be able to separate effects of nuisance factors from 

treatment factors 

We want the majority of the variability of the data 

to be explained by the treatment factors

Observational Study 

vs 

Designed Experiment

Observational Studies VS 

Designed Experiment 

http://xkcd.com/552/

Your conclusions are only as good as 

your data

Observational Studies 

 

 

 

When the researcher has little/no control over sources 

of variation and simply observes what's happening 

Examples: 

– Surveys 

– Investigating effects of cancer on human subjects 

– Weather patterns, stock market price, etc. 

These types of studies relate to the statement: 

Correlation does not imply causation

Designed Experiment 

 

 

 

The researcher identifies and controls sources of 

variation that significantly impact the measured 

response 

Examples: 

– Assign different medications to subjects with a similar 

illness 

– Assign different credit card limit rates to customers 

with a similar financial situation 

– Assign different amounts of carcinogen to lab rats 

Differences between observations primarily result 

from treatment factors (evidence of causation)

Example 1: Obs Study or Design? 

 

 

 

Differences in milk butter fat for cows 

Potential sources of variation: 

– Three age groups (A1, A2, A3) 

– Four breeds (B1, B2, B3, B4) 

Randomly select two cows from each age/breed 

B1 B2 B3 B4 

A1 

A2 

A3

Example 2: Obs Study or Design? 

 

 

 

Develop a treatment to increase milk butter fat 

Potential sources of variation: 

– Age and Breed 

– Treatment (Yes or No) 

Randomly assign treatment to one of the two cows from 

each age/breed 

B1 B2 B3 B4 

A1 Y/N N/Y ... 

A2 

A3

Obs Study VS Design: Summary 

 

 

 

 

Observational studies have minimal intervention by the 

researcher, weakening conclusions about causation 

With designed experiments, the researcher has much more 

control and does everything possible to make variability 

due to treatment factors alone 

Researchers often use observational studies to generate 

hypotheses and test them using designed experiments 

Data collected from both scenarios are analyzed the 

same way, but the conclusions are different.

Example of Designed Experiment

Design Example 

 

 

 

Your child comes home from school and shows you what they 

learned in class. 

He/she asks for a film canister and an Alka-Seltzer tablet. They 

fill the canister with a little water, put the tablet in the water, 

close the canister and turn it upside down. 

After a few seconds, the canister flies in the air! Your child 

wants to know how to make the canister fly as high as possible. 

= BOOM! 

http://www.bbc.co.uk/leicester Water drop | Stock Vector © 

Natalja Jatsuk #2449987 

http://www.aqualuxcarpetcleaning.com 

http://www.youtube.com/watch?v=Gtbane7BBdQ&feature=related


 

 

Question: Does the amount of alka-seltzer affect flight 

time? Which amount gives the longest time? 

Three different amounts of alka-seltzer 

https://healthy.kaiserpermanente.org 

1/2 Tablet 1 Tablet 1.5 Tablets 

 

Response: Time from liftoff to landing in seconds


 

 

 

What are some sources of variation? 

– Amount of alka-seltzer (Treatment Factor) 

– Amount of water 

– Film canister seal 

– Time Measurement 

– Angle of liftoff 

Note: We could control amount of water, but are more 

interested in the amount of alka-seltzer 

Focus on the major sources of variation!

Design Example: Summary 

 

 

We need to reduce the impact of significant sources 

of variation other than the alka-seltzer amount. 

• Let's keep the amount of water constant, say 1/2 full 

• Perform experiment inside to reduce environment impact 

How do we actually perform the experiment? 

• How many times do we need to shoot the canister? 

• What order do we test the tablet amount? Does it matter? 

• Should we use different types of film canisters? 

• How are we going to measure time?

Fundamentals of Design

Experimental Units and Blocks 

 

 

 

 

An experimental unit (EU) is the “material” to which 

treatment factors are assigned 

– Emphasis on the researcher administering the treatment! 

– For the milk butter fat experiment, the cows are the EUs 

Usually we want EUs to be as similar as possible, 

but that isn't always realistic 

A block is a group of EUs more similar than other EUs 

A blocking factor is the characteristic used to create 

the blocks.

Three Fundamental Principles 

 

 

A design is the proposed allocation of treatments to 

EUs 

Three fundamental concepts to any design: 

– Replication of treatment 

– Randomization of treatment assignment 

– Local error control 

• Analysis of Covariance (ANCOVA) 

 

• Blocking of EU's 

Neglecting to acknowledge these will result in 

potential bias and skepticism

Film Canister Experiment 

Sxc.hu 

 

 

 

Treatments: Three different amounts of Alka-Seltzer 

EUs: Assume we have 9 nearly identical film 

canisters. 

How do we use the fundamental principles to compare 

these two designs? 

Run Order   1    2    3    4    5    6    7    8    9
Design 1    1    1    1    1    0.5  0.5  0.5  1.5  1.5
Design 2    1.5  0.5  1.5  1.5  1    1    1    0.5  0.5

Each entry is an EU (one canister); the value is the number of tablets assigned to it.

Replication 

 

 

 

 

 

Replicating a treatment means assigning that treatment 

to multiple EU's 

Increasing replication → Decrease in variance of the estimated treatment effects

If we are equally interested in all treatments, try to replicate each treatment equally

Related question: How many times to replicate? 

FC Example: There are three treatments (tablet size) and 

say we use 9 canisters. So 9/3=3 reps
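A rough illustration of why more replication lowers variance: if runs are independent with a common standard deviation sigma, the standard error of a treatment mean based on r replicates is sigma / sqrt(r). This is only a sketch; the value sigma = 0.3 seconds is a made-up assumption, not a measurement from the film canister experiment, and the function name is hypothetical.

```python
import math

def se_of_treatment_mean(sigma, reps):
    """Standard error of a treatment mean estimated from `reps` independent EUs."""
    return sigma / math.sqrt(reps)

sigma = 0.3  # hypothetical run-to-run standard deviation, in seconds
n_canisters, n_treatments = 9, 3
reps = n_canisters // n_treatments  # 9 / 3 = 3 reps per tablet amount

# Compare one replicate, the planned three, and a larger experiment
for r in (1, reps, 4 * reps):
    print(r, round(se_of_treatment_mean(sigma, r), 4))
```

Quadrupling the number of replicates only halves the standard error, which is one reason the cost of each EU matters when choosing how many to use.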

Replication 

Run Order 1 2 3 4 5 6 7 8 9 



Design 2 replicates each tablet size three times. 

Design 1 replicates the 1-tablet treatment four times, so it will estimate the effect of one tablet better than the other tablet sizes.

Randomization 

 

 

 

 

Randomly assign which EU gets a treatment 

Reduces possibility of most types of bias caused by 

minor and undetectable sources of variation 

How we randomize depends on the type of design 

FC Example: Some film canisters may have a small, undetectable hole, affecting the pressure necessary to launch the canister. Randomizing gives every treatment the same chance of being affected by this, so the defect will not be systematically confounded with any treatment if we repeat the experiment many times.


Run Order 1 2 3 4 5 6 7 8 9 



It's possible to get both run orders by randomization 

Design 1 has a clear pattern, while Design 2 “looks” more 

random 

As long as you used a proper randomization device for 

both designs, use the randomization given to you, 

even if it looks to have a pattern.

Local Error Control 

 

 

 

In general, this is any technique to improve accuracy and 

precision of measuring treatment effects in the design 

Simple example: Have the same person take all 

measurements or operate machinery 

Techniques affecting analysis and/or design: 

– Analysis of Covariance (ANCOVA) 

– Blocking

Local Error Control: ANCOVA 

 

 

 

 

A covariate is a potential source of variation that we can't 

control but can measure during an experiment 

Differences in treatments can be difficult to detect if we 

don't take into account covariate effect 

This does not change the design procedure 

Basic idea: 

– Estimate relationship of covariate and response 

– Compare treatments given this relationship
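The two-step idea above can be sketched in a few lines, assuming a single covariate and a common slope across treatments (the usual simple ANCOVA model). The function name and the wind-speed data are hypothetical and chosen only to show the arithmetic; JMP or other software would normally do this fit.

```python
def mean(values):
    return sum(values) / len(values)

def ancova_adjusted_means(groups):
    """groups maps treatment -> list of (covariate, response) pairs.

    Step 1: estimate the pooled within-treatment slope of response on covariate.
    Step 2: shift each treatment mean to the overall covariate mean.
    """
    grand_x = mean([x for pairs in groups.values() for x, _ in pairs])
    sxy = sxx = 0.0
    group_stats = {}
    for trt, pairs in groups.items():
        xbar = mean([x for x, _ in pairs])
        ybar = mean([y for _, y in pairs])
        group_stats[trt] = (xbar, ybar)
        sxy += sum((x - xbar) * (y - ybar) for x, y in pairs)
        sxx += sum((x - xbar) ** 2 for x, _ in pairs)
    slope = sxy / sxx
    adjusted = {
        trt: ybar - slope * (xbar - grand_x)
        for trt, (xbar, ybar) in group_stats.items()
    }
    return slope, adjusted

# Hypothetical data: (wind speed in mph, flight time in seconds)
flights = {
    "0.5 tablet": [(1.0, 1.9), (2.0, 1.8), (3.0, 1.7)],
    "1 tablet": [(3.0, 2.0), (4.0, 1.9), (5.0, 1.8)],
}
slope, adjusted = ancova_adjusted_means(flights)
print(slope, adjusted)
```

In this made-up data the 1-tablet runs happened to face stronger wind, so the raw means differ by only 0.1 seconds while the covariate-adjusted means differ by about 0.3 seconds.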

Local Error Control: ANCOVA 

 

 

Example: Suppose we did film canister experiment outside 

and there were unpredictable wind gusts 

How does neglecting this information affect comparisons of 

alka-seltzer amount?

Local Error Control: Blocking 

 

 

 

 

Group EU's so that each block contains EU's that 

are more “homogeneous” 

Separate randomizations for each block 

Just like with ANCOVA, we account for differences in 

block and then compare the treatments 

Example: Age/Breed combination was a blocking 

factor for the milk butter fat example when we wanted 

to compare treatments

Local Error Control: Blocking 

 

FC Example: Maybe we want to use three different 

types of film canisters which we feel may be 

significantly different from each other. 

Block 

Each box 

represents an EU 

with the block trait 

Bulletin.accurateshooter.com 

9 EU's in each 

block, call this 

“block size” 

Artnexus.com

Design Fundamentals: Summary 

 

 

 

 

 

 

An experimental unit is what we assign/apply 

treatments to 

A block is a group of EUs more similar than other EUs 

Replication and randomization increase precision and 

reduce known/unknown sources of bias 

Accounting for covariate and block effects improves 

ability to detect treatment differences... 

...but we can't make causal inferences about them!

Causal inference applies only to treatment effects

What is a Replicate?

More on Replication 

 

 

 

An experimental unit is the material we assign/apply 

one treatment replicate to 

Common question: How many replicates do I need? 

Need to consider: 

– Goals of experiment 

– $$$ 

– Are treatments or EUs expensive?

Determining Sample Sizes 

 

 

 

 

There are many things we need to know or guess to 

suggest sample sizes 

Variability 

– The more variability, the more replicates 

Minimal treatment difference to detect 

– The smaller the difference, the more replicates 

Collaboration meetings are ideal for these 

calculations
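To show how variability and the minimal difference drive the answer, here is a rough sketch using the textbook normal-approximation formula for comparing two treatment means, n = 2 (z_{1-alpha/2} + z_{power})^2 sigma^2 / delta^2 per treatment. This is an assumption-laden approximation (known sigma, two treatments, two-sided test), not the method used in a LISA meeting, and the function name is hypothetical.

```python
import math
from statistics import NormalDist

def reps_per_treatment(sigma, delta, alpha=0.05, power=0.80):
    """Approximate replicates per treatment needed to detect a difference
    `delta` between two treatment means when the EU standard deviation is `sigma`."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    n = 2 * ((z_alpha + z_power) * sigma / delta) ** 2
    return math.ceil(n)

# More variability or a smaller difference to detect both require more replicates
print(reps_per_treatment(sigma=1.0, delta=1.0))
print(reps_per_treatment(sigma=1.0, delta=0.5))
print(reps_per_treatment(sigma=2.0, delta=1.0))
```

Halving the difference to detect, or doubling the standard deviation, roughly quadruples the required number of replicates.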

Subsampling: Pseudoreplication 

 

 

 

Naïve idea: Taking multiple measurements on the 

EU can be counted as a replication 

Variability in multiple measurements is measurement 

error not experimental error! 

The different measurements are called observational 

units (OUs) 

Approvedgasmasks.com Onyxinvesting.com Stopwatchsh.com

Consequences of Pseudoreplication 

 

 

 

Usually people average the measurements from the 

OUs and treat it as one observation 

What if we don't do this? 

– We severely underestimate error

– We exaggerate true treatment differences

What if measurement error is high? 

– Try to improve measurement process 

– Revisit experiment and assess homogeneity of 

EUs and think of potential covariates

EU vs OU: Example 

 

 

Applying nitrogen concentration to compost 

What's the EU and what's the OU? 

Apply Nitrogen 

Break up 

into three 

piles 

Take 

measurements 

on the three 

smaller piles 

Gardenphoto.com

Replication vs Subsample: Summary 

 

 

 

 

You can only replicate a treatment if you apply it to a 

new EU 

Determining sample sizes requires assumptions about 

variability 

Subsampling determines reliability of measurements 

Treating a subsample as a replicate increases the 

chance of incorrect conclusions
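A minimal sketch of the usual remedy, averaging the observational units within each experimental unit before analysis. The compost numbers and names below are hypothetical; the point is only that the number of replicates is the number of EUs, not the number of measurements.

```python
from statistics import mean

# Hypothetical data: treatment -> EU (compost batch) -> subsample (OU) measurements
measurements = {
    "low nitrogen": {"batch 1": [4.1, 4.3, 4.2], "batch 2": [3.8, 3.9, 4.0]},
    "high nitrogen": {"batch 3": [5.0, 5.2, 5.1], "batch 4": [4.6, 4.7, 4.8]},
}

def eu_means(data):
    """Collapse the OUs within each EU to a single observation per EU."""
    return {trt: [mean(ous) for ous in eus.values()] for trt, eus in data.items()}

per_eu = eu_means(measurements)

# Replicates are counted per EU; measurements are counted per OU
n_replicates = {trt: len(values) for trt, values in per_eu.items()}
n_measurements = {
    trt: sum(len(ous) for ous in eus.values()) for trt, eus in measurements.items()
}
print(n_replicates, n_measurements)
```

Here each treatment has six measurements but only two replicates, so an analysis that treats all six as replicates would overstate the evidence.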

Completely Randomized Design

Completely Randomized Designs 

 

 

The simplest design assumes all the EUs are similar and the only major source of variation is the treatments

A completely randomized design (CRD) will 

randomize all treatment-EU assignments for the 

specified number of treatment replications 

If equally interested in comparisons of all treatments 

get as close as possible to equally replicating the 

treatments
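One way to carry out the single overall randomization of a CRD is to list every treatment the required number of times and shuffle the whole list. This is a sketch only; the function name and the seed are arbitrary choices for illustration, and any proper randomization device would do.

```python
import random

def completely_randomized_design(treatments, reps, seed=None):
    """Return a list whose i-th entry is the treatment assigned to EU i."""
    rng = random.Random(seed)
    plan = [trt for trt in treatments for _ in range(reps)]
    rng.shuffle(plan)  # one overall randomization across all EUs
    return plan

tablet_amounts = ["1/2 tablet", "1 tablet", "1 1/2 tablets"]
plan = completely_randomized_design(tablet_amounts, reps=3, seed=2013)
for run, trt in enumerate(plan, start=1):
    print(run, trt)
```

Every treatment appears exactly three times, and whichever order comes out is the one to use, even if it seems to show a pattern.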

CRD Example: FC Experiment 

These are “similar” EU's 

The Design Plan: 

Before 


1/2 Tablet 1 Tablet 1 1/2 Tablet

CRD Example: FC Experiment 

The 

Implemented 

Design

Analysis of CRD: Plots 

 

 

 

Boxplots compare responses for different 

treatments 

Do the medians match up? 

Is the spread the same?

Analysis of CRD: ANOVA Table 

Overall ANOVA 

Effect Tests 

 

 

ANOVA partitions total variability into separate, 

independent pieces: 

– MSTrt: Variability due to treatment differences 

– MSError: Variability due to experimental error 

If MSTrt is much larger than MSError (a large F ratio, MSTrt/MSError), then treatments likely have different effects!
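The partition can be written out by hand for a small CRD. The sketch below assumes a one-way layout and uses made-up flight times (three runs per tablet amount); the function name is hypothetical and the numbers are not from the short course data.

```python
def one_way_anova(groups):
    """groups maps treatment -> list of responses. Returns the ANOVA pieces."""
    all_obs = [y for ys in groups.values() for y in ys]
    grand_mean = sum(all_obs) / len(all_obs)
    means = {trt: sum(ys) / len(ys) for trt, ys in groups.items()}

    # Between-treatment and within-treatment sums of squares
    ss_trt = sum(len(ys) * (means[trt] - grand_mean) ** 2 for trt, ys in groups.items())
    ss_error = sum((y - means[trt]) ** 2 for trt, ys in groups.items() for y in ys)

    df_trt = len(groups) - 1
    df_error = len(all_obs) - len(groups)
    ms_trt = ss_trt / df_trt
    ms_error = ss_error / df_error
    return {
        "ss_trt": ss_trt, "ss_error": ss_error,
        "df_trt": df_trt, "df_error": df_error,
        "ms_trt": ms_trt, "ms_error": ms_error,
        "F": ms_trt / ms_error,
    }

# Hypothetical flight times in seconds
times = {
    "0.5 tablet": [1.0, 1.2, 1.1],
    "1 tablet": [1.5, 1.7, 1.6],
    "1.5 tablets": [2.0, 2.2, 2.1],
}
result = one_way_anova(times)
print(result)
```

Software then compares the F ratio to an F distribution with (df_trt, df_error) degrees of freedom to obtain the p-value reported in the ANOVA table.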

Analysis of CRD: Contrasts 

Estimated mean 

difference 0.31 

with 95% 

confidence 

interval: 

(-1.6086, 2.22857) 

 

 

 

At least two treatments are different, which ones? 

Pairwise comparisons 

Use Tukey HSD for multiple pairwise comparisons

Analysis of CRD: Example 

 

 

 

 

Design implementation: Fill the canisters halfway with 

water for each run 

– Replicate each treatment 3 times 

Plot the data! 

ANOVA table for overall treatment differences 

Post-hoc tests: treatment comparisons (contrasts)

CRD & ANOVA: Summary 

 

 

 

 

 

 

CRD has one overall randomization 

Try to equally replicate all the treatments 

Plot your data in a meaningful way to help visualize 

analysis 

Use ANOVA to test for an overall difference 

Look at specific contrasts of interest to better 

understand relationship between treatments 

JMP and other software are great tools but be careful 

in reading and interpreting output

CRD Extended: Factorial Treatments

CRD Extension: Factorial Treatments 

 

 

 

 

Treatments could be combination of multiple factors with 

different levels (think settings) 

We could do a separate experiment for each factor, but 

this is not necessary if we design carefully 

Example: For the FC experiment we may also vary water 

amount (low/medium/high). In this case one “treatment” 

is actually a combination of tablet and water amount 

The specific tablet and water amounts are referred to 

as the levels of the tablet factor and water factor, 

respectively.

Factorial Example: FC Experiment 

1/2 tablet / Low water 

1 1/2 tablet / High Water

Factorial Example: FC Experiment

Interaction Plots 

Interaction! 

 

 

Plots of the treatment means 

Look at the behavior of the means as the levels vary

Factorial ANOVA: Cell Means 

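Cell means (rows: alka-seltzer levels; columns: water levels)

        W1     W2     W3     Mean
A1      2.06   1.74   0.47   1.42
A2      2.09   1.57   0.42   1.36
A3      2.02   1.55   0.39   1.32
Mean    2.06   1.62   0.43

Each “cell mean” is the mean of all reps for that combination (e.g. 2.06 in the first row is the mean of all A1/W1 reps).

Each column mean is the average for all obs with that water level (e.g. 2.06 in the last row is the average for all obs with W1).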

Partition SSTrt into main effects and interactions 

Average water levels different → Water main effect 

Differences between water levels change depending on alka-seltzer amount → Water/Alka interaction
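Using the nine cell means shown on this slide, the marginal means and a simple measure of interaction (cell mean minus row mean minus column mean plus grand mean) can be computed directly. This is a sketch for illustration; the variable names are assumptions, and it works from the rounded cell means rather than the raw data.

```python
# Cell means from the slide: rows are alka-seltzer levels, columns are water levels
cell_means = {
    "A1": {"W1": 2.06, "W2": 1.74, "W3": 0.47},
    "A2": {"W1": 2.09, "W2": 1.57, "W3": 0.42},
    "A3": {"W1": 2.02, "W2": 1.55, "W3": 0.39},
}
alka_levels = list(cell_means)
water_levels = list(cell_means["A1"])

# Marginal means: averaging over the other factor
alka_means = {a: sum(cell_means[a].values()) / len(water_levels) for a in alka_levels}
water_means = {
    w: sum(cell_means[a][w] for a in alka_levels) / len(alka_levels) for w in water_levels
}
grand_mean = sum(alka_means.values()) / len(alka_levels)

# Interaction effect for each cell: what is left after removing both main effects
interaction = {
    (a, w): cell_means[a][w] - alka_means[a] - water_means[w] + grand_mean
    for a in alka_levels
    for w in water_levels
}

print({w: round(m, 2) for w, m in water_means.items()})
print({a: round(m, 2) for a, m in alka_means.items()})
print({cell: round(v, 3) for cell, v in interaction.items()})
```

The water means differ far more than the alka-seltzer means, which is what a water main effect looks like; interaction effects near zero would mean the water differences are about the same at every alka-seltzer amount.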

Factorial Analysis: Example 

 

 

Vary water and alka-seltzer amount (3 levels each) 

Only do 1 replicate each: 

– No degrees of freedom are left for MSE 

– First plot the effects using a half-normal QQ plot 

– Remove insignificant effects and then do ANOVA
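The degrees-of-freedom bookkeeping behind "no degrees of freedom are left for MSE" can be checked with a short sketch for a two-factor factorial. The function name is hypothetical; the formulas are the standard ones for a full two-factor model with interaction.

```python
def factorial_df(levels_a, levels_b, reps):
    """Degrees of freedom for a two-factor factorial with interaction."""
    n_obs = levels_a * levels_b * reps
    df = {
        "A": levels_a - 1,
        "B": levels_b - 1,
        "A x B": (levels_a - 1) * (levels_b - 1),
        "Total": n_obs - 1,
    }
    # Whatever is not used by the treatment effects is left for error
    df["Error"] = df["Total"] - df["A"] - df["B"] - df["A x B"]
    return df

single_replicate = factorial_df(3, 3, reps=1)
two_replicates = factorial_df(3, 3, reps=2)
print(single_replicate)
print(two_replicates)
```

With one replicate of the 3 x 3 design, all 8 degrees of freedom go to main effects and the interaction, which is why some effects must be dropped before an error term can be estimated.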

Factorial Design: Summary 

 

 

 

 

 

Efficient way to test effect of multiple treatment 

factors 

One treatment is combination of multiple factors 

We may extend to more than two factors, but the 

number of necessary EU's grows rapidly! 

Interaction plots help visualize effects 

Main effects and interactions are specific types of 

important treatment comparisons

Complete Block Design

Blocking to Reduce Variance 

 

 

 

 

We think there is some source of variation that is an 

inherent trait of the EUs 

A block is a group of EUs more similar than the other 

EUs 

Basic Idea: Compare treatments within blocks to 

account for source of variation 

If blocking has a significant effect, we can greatly 

reduce the variability of the treatment effects.

Blocking to Reduce Variance 

Design questions: 

– How many EUs per block? 

– How do we assign treatments to the EUs? 

– How do we randomize? 

Block 1 Block 2 Block 3 

This is not a pretty 

design situation. What 

are some problems we 

may run into?

Block Examples 

 

 

 

 

 

From FC example, we blocked by canister 

Male and Female 

Plots in a field (close together more similar) 

Note that in all of these cases, we cannot assign a 

block to an EU, it is an inherent property of the EU 

It's possible to create blocks (groups) from 

covariate information but we have to be able to 

randomize the treatments within the blocks!

Assigning treatment to blocks 

 

Remember, we want to “remove” block effects to 

increase precision of treatment effects 

What's wrong with this design? Say we have 2 

treatments: T1 and T2 

Run Order   1   2   3   4   5   6   7   8   9
Block 1     T1  T1  T1  T1  T1  T1  T1  T1  T1
Block 2     T2  T2  T2  T2  T2  T2  T2  T2  T2

 

Complete block design: 

# EUs per block = # treatments

Block Design: RCBD 

 

 

 

 

The block size is the number of EUs for the block 

If the block size equals the number of treatments we 

call this a randomized complete block design. 

Think of this as separate CRD's for each block with 

one replicate. So when we randomize we want to... 

RANDOMIZE TREATMENTS IN EACH BLOCK 

We can test if the block means are different, but 

cannot conclude differences were caused by the 

blocking factor.
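Randomizing treatments separately in each block can be sketched as one shuffle per block. The function name, block labels, and seed below are hypothetical and only illustrate the procedure.

```python
import random

def randomized_complete_block_design(treatments, blocks, seed=None):
    """Return a dict mapping each block to its own random run order of all treatments."""
    rng = random.Random(seed)
    design = {}
    for block in blocks:
        order = list(treatments)
        rng.shuffle(order)  # a separate randomization for each block
        design[block] = order
    return design

tablet_amounts = ["1/2 tablet", "1 tablet", "1 1/2 tablets"]
canisters = ["canister 1", "canister 2", "canister 3"]
design = randomized_complete_block_design(tablet_amounts, canisters, seed=17)
for block, order in design.items():
    print(block, order)
```

Each block receives every treatment exactly once, so treatment comparisons are made within blocks and are not mixed up with block differences.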

RCBD Analysis: FC Example 

Design plan before randomization (tablet level, water level):

Block 1   Block 2   Block 3
1 1       1 1       1 1
1 2       1 2       1 2
1 3       1 3       1 3
2 1       2 1       2 1
2 2       2 2       2 2
2 3       2 3       2 3
3 1       3 1       3 1
3 2       3 2       3 2
3 3       3 3       3 3

Recall, the EUs in the blocks are the time order of reuses of the same canister

1 1 means 1/2 tablet, low water; 3 3 means 1 1/2 tablets, high water

Recall, we randomize within each block (3 total randomizations)

RCBD Analysis: FC Example 

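Implemented design after randomizing within each block (tablet level, water level):

Run   Block 1   Block 2   Block 3
1     1 2       2 1       1 1
2     2 3       3 1       3 1
3     3 1       3 2       3 3
4     2 1       2 3       3 2
5     1 1       1 3       2 1
6     2 2       1 2       1 3
7     3 3       1 1       2 3
8     1 3       2 2       1 2
9     3 2       3 3       2 2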

Assessing block efficiency 

 

We hope that blocking will account for a lot of 

SSError → Reduction in experimental error 

 

Software is going to give you a p-value for Block, but 

only use this to gauge how much we reduced 

experimental error 

 

If MSBlock is insignificant, can we do CRD analysis?

RCBD: Analysis 

 

 

 

 

 

Here we use 6 different film canisters (blocks) and have 9 runs for each block = 54 total observations

Interaction plots including block information 

Assess blocking efficiency: MSB/MSE 

Block/Treatment interactions? 

Post-hoc analysis follows naturally
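The RCBD analysis adds a block term to the ANOVA partition, and the ratio MSB/MSE gives a feel for how much experimental error the blocking removed. The sketch below assumes one observation per block/treatment cell and uses a small made-up data set (3 canisters, 3 tablet amounts) rather than the 54-run example; the function name is hypothetical.

```python
def rcbd_anova(table):
    """table maps block -> {treatment: response}, one observation per cell."""
    blocks = list(table)
    treatments = list(table[blocks[0]])
    b, t = len(blocks), len(treatments)
    grand = sum(table[bl][tr] for bl in blocks for tr in treatments) / (b * t)
    block_means = {bl: sum(table[bl].values()) / t for bl in blocks}
    trt_means = {tr: sum(table[bl][tr] for bl in blocks) / b for tr in treatments}

    # Partition the total sum of squares into block, treatment, and error pieces
    ss_total = sum((table[bl][tr] - grand) ** 2 for bl in blocks for tr in treatments)
    ss_block = t * sum((m - grand) ** 2 for m in block_means.values())
    ss_trt = b * sum((m - grand) ** 2 for m in trt_means.values())
    ss_error = ss_total - ss_block - ss_trt

    df_block, df_trt = b - 1, t - 1
    df_error = df_block * df_trt
    ms_block, ms_trt, ms_error = ss_block / df_block, ss_trt / df_trt, ss_error / df_error
    return {
        "ss_block": ss_block, "ss_trt": ss_trt, "ss_error": ss_error,
        "df_error": df_error,
        "F_trt": ms_trt / ms_error,
        "block_efficiency": ms_block / ms_error,  # MSB / MSE
    }

# Hypothetical flight times in seconds
flight_times = {
    "canister 1": {"0.5": 1.0, "1": 1.5, "1.5": 2.1},
    "canister 2": {"0.5": 1.4, "1": 2.0, "1.5": 2.4},
    "canister 3": {"0.5": 0.9, "1": 1.3, "1.5": 2.0},
}
result = rcbd_anova(flight_times)
print(result)
```

A large MSB/MSE suggests the canisters really did differ, so a CRD analysis of the same data would have folded that canister-to-canister variability into the error term.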

Complete Block: Summary 

 

 

 

 

 

Blocking is a technique to reduce experimental 

error 

Also broadens the validity of treatment effects 

No causal inference for block effects! 

Analysis is similar to CRD and factorial 

RCBD is a simple block design where block size 

equals number of treatments

Other Designs 

and 

Overall Summary

More design scenarios 

 

 

 

 

 

Fractional factorial with blocking 

– Many factors but only low-level interactions 

Incomplete block designs 

– Block size < # of treatments 

Crossed blocking factors 

– Think factorial effects but for blocks 

Split-plot designs 

– Factorial effects, but some factors are harder to change 

Repeated Measures 

– Multiple measurements taken across time (autocorrelation)

Overall Summary 

 

 

 

 

Introduction to design terminology and fundamental 

principles 

Specifically looked at CRD, factorial and RCBD design scenarios

Briefly covered analysis approach and emphasized the 

appropriate interpretations that help with answering 

research questions 

LISA can help you design efficient experiments that 

will help you answer your research questions!
