18.01.2013 Views

Controlled Lab Experiments - Allen Bevans


IAT 432<br />

Week 6<br />

Controlled Experiments 1/3<br />

(Assignment 3)<br />

Scientific Method and<br />

Hypothesis Testing


What is Science?<br />

• Examples?<br />

• How is that different from what designers do?<br />

3


What is Science?<br />

• Systematic inquiry<br />

• Community standards (rigor)<br />

• Repeatable<br />

4


Descriptions vs Explanations<br />

• Description: Qualitative<br />

– Words<br />

– Summarized via top-down or bottom-up analysis, written<br />

summary.<br />

– “I observed these statements…”<br />

• Description: Quantitative<br />

– Numerical data sets<br />

– Summarized via stats, graphs, etc.<br />

– “I observed these measurements…”<br />

5


Descriptions vs Explanations<br />

• Explanation: Qualitative<br />

– Top-down or bottom-up analysis<br />

– Shaped by what we expect<br />

– “These statements suggest…”<br />

• Explanation: Quantitative<br />

– Statistical analysis<br />

– Shaped by what we expect<br />

– “These measurements suggest…”<br />

• How do we know what to expect?<br />

7


Scientific Method<br />

• Theory<br />

• Causes and effects<br />

– A causes effect B, or A affects B.<br />

• Testable hypothesis<br />

• Controlled Experiment with subset of population<br />

• Evidence to support hypothesis?<br />

• Generalize to population<br />

8


Experimental Method<br />

• Scientific Method → Empirical Method<br />

• “Evidence”<br />

• Claims/hypotheses<br />

• Quantitative Data<br />

• Objective<br />

9


Causes and Effects in Usability<br />

A → B<br />

• A = cause = input or interface feature<br />

• B = effect = human performance, preference,<br />

experience<br />

• Common form:<br />

– A1 is better than A2 for causing B<br />

10


Controlled Experiment Approach<br />

A controlled experiment for usability evaluations is good<br />

for certain kinds of questions …<br />

• Is one design better in terms of usability than another?<br />

• Does a change in interface feature change usability?<br />

– i.e. performance (effectiveness/efficiency) or preference<br />

(satisfaction)<br />

11


Mouse Size Example<br />

• Theory: Mouse size affects children’s performance on<br />

selection tasks<br />

• Hypothesis: Children can select targets faster with a<br />

small mouse compared to a regular mouse.<br />

• Comparative controlled experiment<br />

– Cause: Mouse size -- e.g., MS Optical mouse 15 cm L x 10<br />

cm W x 7 cm H vs 10 cm L x 7 cm W x 4 cm H<br />

– Effect: task time, from the start of the task to target selection<br />

– Population: children age 4-6<br />

12


Term<br />

• Empirical = Relying on or derived from observation<br />

For example, a user-based usability study is an empirical study.<br />

13


Hypotheses<br />

• Hypothesis (singular)<br />

• Statement<br />

• Causes and effects<br />

• Interface design A causes B<br />

• Comparing two designs<br />

– A1 is better than A2 for B<br />

14


True or Not? The Logic of Proofs<br />

• Example<br />

– Statement: All swans are white.<br />

– Observe 10,000,000 swans.<br />

– All the observed swans are white.<br />

– Proof?<br />

15


True or Not? The Logic of Proofs<br />

• Can NEVER prove a hypothesis …<br />

• Can NEVER prove a statement with observations.<br />

• Can only find support for or disprove.<br />

• Example<br />

– Statement: All swans are white.<br />

– Observe 10,000,000 swans.<br />

– All the observed swans are white.<br />

– Proof? No. Support: Yes.<br />
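The asymmetry on this slide can be sketched in a few lines of Python. This is an illustration only, not part of the original lecture: the function name `evaluate_universal_claim` and the sample observations are hypothetical. Any number of confirming observations yields only "supported", while a single counterexample yields "disproven".

```python
def evaluate_universal_claim(observations, holds):
    """Check a claim of the form "all X satisfy `holds`" against observations.

    Returns "disproven" as soon as one counterexample appears; otherwise
    returns "supported", never "proven", because unobserved cases remain.
    """
    for item in observations:
        if not holds(item):
            return "disproven"
    return "supported"


def is_white(swan_colour):
    """The property claimed of every swan."""
    return swan_colour == "white"


# Hypothetical observations, for illustration only.
many_white_swans = ["white"] * 10_000
print(evaluate_universal_claim(many_white_swans, is_white))
print(evaluate_universal_claim(many_white_swans + ["black"], is_white))
```

The function deliberately has no "proven" return value: that outcome is not reachable from observations alone.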

16


Logic of Proofs, cont’d<br />

Support a statement by disproving the opposite statement.<br />

How to disprove … If all swans are white, then no swans are black.<br />

Implied statement: No swans are black (or blue, pink, orange).<br />

Find 1 black swan!<br />

“No swans are black” is disproven, and with it “All swans are white”.<br />

17


Logic of Proofs, cont’d<br />

In the scientific method, the approach is to disprove a “null” hypothesis.<br />

All we can say, then, is “Evidence supports…”<br />

18


Apply logic of proofs to hypotheses<br />

• Opposite is “no effect” … “no better”<br />

• Called “null” hypothesis …<br />

• E.g. Smelliness and dating<br />

• General = Smell affects number of dates.<br />

• Null = Smell does not affect number of dates.<br />

• Directional = Smell gets more dates.<br />

– Better smell gets more dates?<br />

20


Operationalizing the Hypothesis<br />

• Hypothesis must be testable<br />

• Operationalize →<br />

1. Isolate & specify cause (e.g. interface feature)<br />

2. Measure effect (performance, preference,<br />

experience) on some kind of activity/task<br />

3. Specify population<br />
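The three steps above can be sketched as a small record type. This is a minimal sketch, not something from the lecture: the class name `OperationalizedHypothesis` and its fields are hypothetical, and the values filled in are taken from the mouse size example used elsewhere in these slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OperationalizedHypothesis:
    cause: str            # step 1: the isolated cause (e.g. an interface feature)
    levels: tuple         # the specific values of the cause being compared
    effect_measure: str   # step 2: what is measured
    task: str             # the activity on which the effect is measured
    population: str       # step 3: who the result should generalize to

    def is_testable(self):
        """Testable only if every part is specified and at least two levels are compared."""
        parts = (self.cause, self.effect_measure, self.task, self.population)
        return all(part.strip() for part in parts) and len(self.levels) >= 2

    def null_statement(self):
        """The matching "no effect" statement."""
        return (f"{self.cause} does not affect {self.effect_measure} "
                f"for {self.population} on a {self.task}.")


mouse_study = OperationalizedHypothesis(
    cause="Mouse size",
    levels=("small", "regular"),
    effect_measure="task time",
    task="target selection task",
    population="children age 4-6",
)
print(mouse_study.is_testable())
print(mouse_study.null_statement())
```

Leaving any field empty, or listing only one level, makes `is_testable()` return `False`, which mirrors the point that a hypothesis without a specified cause, measure, and population cannot be tested.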

21


Mouse Size Example<br />

• Theory: Mouse size affects children’s performance on<br />

selection tasks<br />

• Hypothesis: Children can select targets faster with a<br />

small mouse than with a regular mouse.<br />

• Compare two designs in a controlled experiment<br />

– Cause: Mouse size -- e.g., MS Optical mouse 15 cm L x 10<br />

cm W x 7 cm H vs 10 cm L x 7 cm W x 4 cm H<br />

– Effect: task time, from the start of the task to target selection<br />

– Population: children age 4-6<br />

22


Mouse Size Example, cont’d<br />

• Measure task time for small and regular groups<br />

• Quantitative Data<br />

– Two data sets, one for small mouse and one for regular<br />

mouse group<br />

– For each group: the average time across 10 tasks for 20<br />

children.<br />

• On average, is a smaller mouse faster?<br />

• If it is, did we prove the hypothesis? NO…<br />
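The averaging described on this slide can be sketched as follows. All numbers are hypothetical and chosen only for illustration, and 3 children with 4 tasks stand in for the 20 children and 10 tasks on the slide to keep the example short. The function name `group_mean_time` is also an assumption.

```python
from statistics import mean

# Hypothetical task times in seconds: one row per child, one value per task.
small_mouse_times = [
    [2.0, 3.0, 2.5, 2.5],
    [3.0, 3.5, 2.5, 3.0],
    [2.5, 2.0, 3.0, 2.5],
]
regular_mouse_times = [
    [3.5, 4.0, 3.0, 3.5],
    [4.0, 4.5, 3.5, 4.0],
    [3.0, 3.5, 4.0, 3.5],
]


def group_mean_time(times_per_child):
    """Average each child's times across tasks, then average across children."""
    per_child = [mean(child_times) for child_times in times_per_child]
    return mean(per_child)


small_avg = group_mean_time(small_mouse_times)
regular_avg = group_mean_time(regular_mouse_times)
print(f"small: {small_avg:.2f} s, regular: {regular_avg:.2f} s")
print("smaller mouse faster on average:", small_avg < regular_avg)
```

A lower average for the small mouse group is only a description of these two data sets. On its own it does not prove the hypothesis.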

23


Mouse Size Example<br />

• Say we found that, on average, children were faster<br />

with the smaller mouse.<br />

• Null Hypothesis: Mouse size does not affect children’s<br />

speed on target selection tasks.<br />

• But we found that, on average, the 20 children<br />

were faster with the smaller mouse …<br />

• So the Null is unlikely to be true … reject the Null, provided the difference is larger than chance would explain.<br />

• Original hypothesis is “supported” (not proved)<br />
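One way to check whether an observed difference could plausibly arise under the null hypothesis is an exact permutation test. The slides do not name a specific test, so this technique is swapped in here purely as an illustration. The function name `exact_permutation_p_value` and the per-child mean times below are hypothetical. The idea: if mouse size has no effect, the group labels are arbitrary, so we count how many relabellings of the pooled data produce a difference at least as large as the one observed.

```python
from itertools import combinations
from statistics import mean


def exact_permutation_p_value(faster_group, slower_group):
    """One-sided exact permutation test on the difference of group means.

    Returns the fraction of all possible relabellings whose difference
    (mean of "slower" minus mean of "faster") is at least the observed one.
    """
    pooled = list(faster_group) + list(slower_group)
    n_fast = len(faster_group)
    n_slow = len(slower_group)
    observed = mean(slower_group) - mean(faster_group)
    total = sum(pooled)
    at_least_as_large = 0
    n_splits = 0
    for indices in combinations(range(len(pooled)), n_fast):
        fast_sum = sum(pooled[i] for i in indices)
        difference = (total - fast_sum) / n_slow - fast_sum / n_fast
        if difference >= observed - 1e-9:  # small tolerance for float rounding
            at_least_as_large += 1
        n_splits += 1
    return at_least_as_large / n_splits


# Hypothetical mean task times (seconds) per child, five children per group.
small_mouse = [4.1, 3.8, 4.5, 3.9, 4.2]
regular_mouse = [5.0, 5.4, 4.9, 5.6, 5.1]

p_value = exact_permutation_p_value(small_mouse, regular_mouse)
print(f"p = {p_value:.4f}")
```

With these made-up values every small-mouse time is below every regular-mouse time, so only 1 of the 252 possible relabellings reaches the observed difference (p ≈ 0.004). A small value like this is evidence against the null, which supports the original hypothesis without proving it.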

24


More terms … Variables<br />

A variable is something that changes and can have<br />

different values that can be specified or measured<br />

Examples<br />

Font size = 8, 10, 12 (varies & can be 8 or 10 or 12)<br />

Colour = red, green, blue (varies & can be one of … )<br />

Time = n seconds (n varies & can be 0 – 600 seconds)<br />

Error rate = x% (x varies & can be 0 – 100%)<br />

Subject type = novice or expert, male or female<br />

25


Kinds of Variables<br />

• Cause → Independent Variable (IV)<br />

– The input or interface feature you have different designs<br />

for (e.g., mouse size = small or regular)<br />

– Characteristics of users (novice/expert)<br />

• Effect → Dependent Variable (DV)<br />

– the human behaviors or experiences that you measure for<br />

each level of the IV.<br />

– E.g., task time = 0 – 600 seconds<br />
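The distinction between the two kinds of variables can be sketched in code. This is a minimal sketch with hypothetical class names (`IndependentVariable`, `DependentVariable`) and a hypothetical trial record. The levels and the 0 – 600 second range come from the examples on these slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IndependentVariable:
    name: str
    levels: tuple  # values chosen by the experimenter, e.g. ("small", "regular")

    def has_level(self, level):
        """True if `level` is one of the values the experimenter set up."""
        return level in self.levels


@dataclass(frozen=True)
class DependentVariable:
    name: str
    minimum: float
    maximum: float

    def is_valid(self, value):
        """A measurement is valid if it falls inside the allowed range."""
        return self.minimum <= value <= self.maximum


mouse_size = IndependentVariable("mouse size", ("small", "regular"))
task_time = DependentVariable("task time (s)", 0, 600)

# One hypothetical trial: the IV level is set, the DV value is measured.
trial = {"mouse size": "small", "task time (s)": 4.2}
print(mouse_size.has_level(trial["mouse size"]))
print(task_time.is_valid(trial["task time (s)"]))
```

The IV is described by a fixed set of levels because the experimenter chooses them in advance. The DV is described by a range because its value is only known once it has been measured.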

26


Independent Variable<br />

• The thing the experimenter changes or manipulates<br />

independent of users’ behavior to see how it affects<br />

users’ behavior<br />

– Often some small aspect of an interface feature<br />

– Can also arise from grouping users (expert/novice)<br />

• Examples<br />

– Font size 8, 10 or 12<br />

– Keyboard layout style: phone pad or alphabetic<br />

– Expert vs novice users<br />

27


Dependent Variable<br />

• A variable that depends on users’ behaviors<br />

• The thing you measure<br />

• So, a dependent variable is some aspect of behavior<br />

that changes/varies and can be measured like “task<br />

time” or “rating”<br />

28


Hypothesis<br />

• Relates independent and dependent variables<br />

• A change in indep var causes an effect on dep variable<br />

• Indep var = mouse size; dep var = task time<br />

• Hyp: A smaller mouse size improves task time for<br />

children age 4-6 on a target selection task.<br />

• Null: Mouse size does not affect task time …<br />

29


Basic Form of CE for Usability Study<br />

• Hypothesis<br />

• Independent variable = interface aspect(s)<br />

• Independent variable = interface aspect(s)<br />

• Dependent variable = human performance/preference<br />

• Sample Population<br />

• Select levels of IV that are varied between groups<br />

• Measure DV(s) for each group<br />
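Varying the IV between groups means each participant is placed in exactly one group. The slide does not say how participants are placed, so the sketch below assumes random assignment, a common choice. The function name `assign_to_groups`, the participant IDs, and the fixed seed are all hypothetical.

```python
import random


def assign_to_groups(participants, levels, seed=0):
    """Shuffle participants, then deal them out to the IV levels in turn."""
    rng = random.Random(seed)  # fixed seed so the assignment is repeatable
    shuffled = list(participants)
    rng.shuffle(shuffled)
    groups = {level: [] for level in levels}
    for i, person in enumerate(shuffled):
        groups[levels[i % len(levels)]].append(person)
    return groups


# Hypothetical participant IDs.
participants = [f"P{n:02d}" for n in range(1, 21)]
groups = assign_to_groups(participants, ["small", "regular"])
for level, members in groups.items():
    print(level, len(members), members)
```

Dealing out a shuffled list keeps the group sizes equal, and each participant experiences only one level of the IV, so the DV can then be measured separately for each group.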

30


About reading comprehension…<br />

31


This Week’s Studio<br />

• Work through Assignment 3<br />

• Install and run software<br />

• Data collection<br />

• Meet new team<br />

35


Next Lectures<br />

• Week 7: More on controlled experiments<br />

– Validity and Reliability<br />

• Week 7: Review of methods to date<br />

• Week 8: Analysis/Statistics … how do you know from<br />

the data sets that the DVs for groups are different?<br />

36


Read<br />

• Martin, Doing Experiments in Psychology<br />

• Chapters 1,2,7,8<br />

• Glossary<br />

• Dix Chapter 9 Handout 9.4.2<br />

37
