09.08.2013 Views

Fundamentals of epidemiology - an evolving text - Are you looking ...


Obviously it will be easier to assemble these materials if careful documentation is prepared along the way. As a minimum, every document should bear a date and a notation of authorship. Documents retained in a word processing file should, if possible, contain a notation of where the document file resides, so that it can be located for later revision or adaptation in creating related documents.

1.4.9 Poke around

The amount of activity and detail in a large project can easily exceed what the (usually limited) staff (and time-urgent investigators) are able to handle comfortably. Despite the highest motivation and experience, communication will be incomplete and important items will be overlooked. Investigators should review data forms regularly to familiarize themselves with the data in its raw form and verify that data collection and coding are being carried out as stipulated. It may also be worthwhile even to poke around occasionally in file drawers, stacks of forms, and among computer files.
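For data that have already been keyed, part of this spot-checking can be done by program. The following is a minimal sketch, not a procedure prescribed by this text: it assumes each record is a Python dictionary and that the permitted codes for each item are known. The item names, the codes, and the function `find_invalid_codes` are all hypothetical.

```python
# Hypothetical permitted codes for two items (9 = missing/unknown).
ALLOWED_CODES = {
    "sex": {1, 2, 9},
    "smoker": {0, 1, 9},
}

def find_invalid_codes(records, allowed=ALLOWED_CODES):
    """Return (record index, item, value) for every value that is not a permitted code.

    An item that is absent from a record is reported with the value None.
    """
    bad = []
    for index, record in enumerate(records):
        for item, codes in allowed.items():
            value = record.get(item)
            if value not in codes:
                bad.append((index, item, value))
    return bad

# Made-up records, for illustration only.
records = [
    {"sex": 1, "smoker": 0},
    {"sex": 3, "smoker": 1},   # 3 is not a permitted code for sex
    {"sex": 2},                # smoker was never entered
]
invalid = find_invalid_codes(records)
for index, item, value in invalid:
    print(f"record {index}: {item} = {value!r} is not a permitted code")
```

A check of this kind supplements, and does not replace, looking at the original forms: it can show that a code is impossible, but not that a permitted code was entered for the wrong response.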

1.4.10 Repeat data analyses

The fact that debugging is a major component of commercial software development suggests that investigators need to make provisions to detect and correct programming errors, or at least to minimize their impact. One strategy is to have a different programmer replicate analyses prior to publication. Typically only a small portion of the analyses that have been carried out end up in a publication, so this strategy is more economical than many others, though of course if serious errors have misdirected the analysis, much work will have been lost. Replication that begins as close as possible to the raw data offers the greatest protection, but more often other methods are used to ensure that the first analysis dataset was created correctly. An object lesson about the importance of verifying the accuracy of data analyses is offered by the following excerpt from a letter to the New England Journal of Medicine (Jan 14, 1999, p148):

"In the February 5 issue, we reported the results of a study … We regretfully report that we have discovered an error in computer programming and that our previous results are incorrect. … After the error was corrected, a new analysis showed no significant increase …"

It is a good bet that error reports such as these are the tip of the iceberg.
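Once a second programmer has replicated the analyses, the two sets of results still have to be compared item by item. One way this might be done is sketched below, under the assumption that each programmer's results have been collected into a dictionary of named numeric estimates. The function `compare_results`, the result names, and the values are hypothetical and serve only as illustration.

```python
import math

def compare_results(primary, replicate, rel_tol=1e-6):
    """List the discrepancies between two dictionaries of named numeric results.

    A result is reported if it is missing from either set or if the two
    values differ by more than the relative tolerance rel_tol.
    """
    problems = []
    for name in sorted(set(primary) | set(replicate)):
        if name not in primary:
            problems.append((name, "missing from primary"))
        elif name not in replicate:
            problems.append((name, "missing from replicate"))
        elif not math.isclose(primary[name], replicate[name], rel_tol=rel_tol):
            problems.append((name, f"{primary[name]} != {replicate[name]}"))
    return problems

# Made-up results from two independently written programs.
primary = {"odds_ratio": 2.25, "n_cases": 120, "n_controls": 240}
replicate = {"odds_ratio": 1.10, "n_cases": 120, "n_controls": 240}

discrepancies = compare_results(primary, replicate)
for name, detail in discrepancies:
    print(f"{name}: {detail}")
```

Agreement between the two programs does not prove that both are correct, since both programmers may have worked from the same faulty dataset or specification, which is why replication that starts near the raw data is more protective.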

2. Data conversion

Picture stacks of questionnaires, stacks of medical record abstract forms, lists of laboratory results, photocopies of examination results, and the like. Before they can be analyzed, these original data need to be coded for computerization (even if the volume is small enough and the intended analyses are simple enough for manual tabulation, coding is still required). This process can be a major and arduous undertaking, and may involve the following steps for each data "stream":

1. Preparation of a coding manual stating the codes to be used for each data item and the decisions to be made in every possible situation that occurs (see sample coding manual);

_____________________________________________________________________________________________

www.sph.unc.edu/EPID168/ © Victor J. Schoenbach 16. Data management and data analysis - 531

rev. 9/27/1999, 10/22/1999, 10/28/1999
