Fundamentals of epidemiology - an evolving text - Are you looking ...

Algorithms - A procedure that uses a set of criteria according to specific rules or considerations, e.g., major depressive disorder, "effective" contraception (I have not seen this term used to designate a type of variable before, but I am not aware of any other term for this concept).

Preparatory work – Exploring the data

Try to get a "feel" for the data – inspect the distribution of each variable. Examine bivariate scatterplots and cross classifications. Do the patterns make sense? Are they believable?

Observe shape – symmetry vs. skewness, discontinuities

Select summary statistics appropriate to the distribution and variable type (nominal, ordinal, measurement)

Location - mean, median, percentage above a cut-point

Dispersion - standard deviation, quantiles

Look for relationships in data

Look within important subgroups

Note proportion of missing values
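The summary statistics listed above can be sketched in a few lines of Python. This is only an illustration using the standard library; the variable name `sbp`, the data values, and the cut-point of 140 are hypothetical and do not come from the text. Missing values are represented as `None`.

```python
import statistics

# Hypothetical measurements; None marks a missing value.
sbp = [118, 126, None, 142, 135, 121, None, 160, 129, 138]

def summarize(values, cut_point):
    """Return location, dispersion, and missingness summaries for one variable."""
    observed = [v for v in values if v is not None]
    n_missing = len(values) - len(observed)
    # Quartiles computed with the module's default ("exclusive") method.
    q1, q2, q3 = statistics.quantiles(observed, n=4)
    return {
        "n_observed": len(observed),
        "prop_missing": n_missing / len(values),
        "mean": statistics.mean(observed),
        "median": statistics.median(observed),
        "pct_above_cut": 100 * sum(v > cut_point for v in observed) / len(observed),
        "sd": statistics.stdev(observed),  # sample standard deviation
        "quartiles": (q1, q2, q3),
    }

summary = summarize(sbp, cut_point=140)
for name, value in summary.items():
    print(name, value)
```

For a skewed distribution, the median and quartiles would usually be reported in preference to the mean and standard deviation; the function returns both so that the choice can be made after inspecting the shape.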

Preparatory work – Missing values

Missing data are a nuisance and can be a problem. For one, missing responses mean that the denominators for many analyses differ, which can be confusing and tiresome to explain. Also, analyses that involve multiple variables (e.g., coefficient alpha, crosstabulations, regression models) generally exclude an entire observation if it is missing a value for any variable in the analysis (this method is called listwise deletion). Thus, an analysis involving 10 variables, even if each has only 5% missing values, could result in excluding as much as 50% of the dataset (if there is no overlap among the missing responses)! Moreover, unless data are missing completely at random (MCAR – equivalent to a pattern of missing data that would result from deleting data values throughout the dataset without any pattern or predilection whatever), then an analysis that makes no adjustment for the missing data will be biased, because certain subgroups will be underrepresented in the available data (a form of selection bias).
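The arithmetic behind the 50% figure can be checked directly. The sketch below uses the 10 variables and 5% missingness from the text; the independence scenario, the function names, and the tiny dataset are additions for illustration only.

```python
def worst_case_excluded(n_vars, p_missing):
    """Fraction of observations dropped by listwise deletion when no observation
    is missing more than one variable (no overlap), capped at 1."""
    return min(1.0, n_vars * p_missing)

def independent_excluded(n_vars, p_missing):
    """Fraction dropped if missingness is independent across variables
    (an assumption added here for comparison)."""
    return 1.0 - (1.0 - p_missing) ** n_vars

def listwise_delete(rows):
    """Keep only rows with no missing (None) value in any variable."""
    return [row for row in rows if all(v is not None for v in row)]

print(worst_case_excluded(10, 0.05))
print(round(independent_excluded(10, 0.05), 3))

# Tiny hypothetical dataset: 4 observations, 3 variables.
rows = [(1, 2, 3), (4, None, 6), (7, 8, None), (10, 11, 12)]
complete = listwise_delete(rows)
print(len(complete), "of", len(rows), "observations retained")
```

With no overlap the loss is the full 10 × 5% = 50%; if missingness were instead independent across variables, about 40% of observations would still be excluded, so the problem is substantial under either assumption.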

Imputation for missing values - optional topic

As theory, methods, and computing power have developed over the years, analytic methods for handling missing data to minimize their detrimental effects have improved. These methods seek to impute values for the missing item responses in ways that attempt to increase statistical efficiency (by avoiding the loss of observations which have one or a few missing values) and reduce bias. Earlier methods of imputation, now out of favor, include replacing each missing value by the mean or median for that variable. Even though such practices enable all observations to be used in regression analyses, these methods do not

_____________________________________________________________________________________________

www.sph.unc.edu/courses/EPID 168, © Victor J. Schoenbach 14. Data analysis and interpretation – 459

rev. 11/8/1998, 10/26/1999, 12/26/1999
