08.12.2012 Views

Scientific Concept of the National Cohort (status ... - Nationale Kohorte


A.6 Planned statistical analyses and statistical power considerations

YLD = number of incident cases in that period × average duration of the disease × disability weight.

The disability weight reflects the severity of the disease on a scale from 0 (perfect health) to 1 (death) 800.
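The YLD formula above can be sketched in a few lines of Python. The function name and the input values are hypothetical and chosen only for illustration; they are not taken from the National Cohort concept.

```python
def years_lived_with_disability(incident_cases, average_duration_years, disability_weight):
    """YLD = incident cases x average disease duration x disability weight."""
    if not 0.0 <= disability_weight <= 1.0:
        raise ValueError("disability weight must lie between 0 (perfect health) and 1 (death)")
    if incident_cases < 0 or average_duration_years < 0:
        raise ValueError("case count and duration must be non-negative")
    return incident_cases * average_duration_years * disability_weight


# Hypothetical inputs: 1,000 incident cases, 4 years average duration, weight 0.25.
yld = years_lived_with_disability(1000, 4.0, 0.25)
print(f"YLD: {yld:.1f} years")
```

A weight of 0 yields no years lived with disability regardless of case numbers, while a weight of 1 counts every year of disease duration in full.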

The DALY concept has been used relatively little in epidemiologic research in Germany. Terschüren et al. (2009) 801 estimated the future burden of disease using the population forecast with respect to age and sex distribution only. In the National Cohort, long-term observations can be used to assess the effect of changing risk factor patterns on population health.

A.6.2.4 Risk prediction models

In risk prediction modeling, a first step will be to develop a discriminative RR model that includes the relevant predictor variables. For an analysis within the full cohort, “time to disease onset” can be analyzed by means of a Cox proportional hazards regression. In a nested case-control setting with a binary disease outcome, a risk prediction model can be derived by means of a logistic regression model. Within such models, new markers can be combined with established risk information to assess their significance and their relevance with respect to risk prediction. The general goal of risk prediction models will be to integrate the information from a potentially large series of risk factors; thus, model selection procedures such as stepwise model selection or boosting need to be applied 802. The choice of a model selection method will depend on the specific situation regarding, for example, data quality and the existence of prior, additional evidence. A relevant concern in a model-building process is “overfitting,” the tendency of a statistical model to capture aspects that are unique to the study sample. By incorporating cross-validation or bootstrapping methods into the model selection process, we can keep this effect under control 803.
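As a minimal sketch of how cross-validation exposes overfitting, the following Python example fits a logistic regression to a small synthetic sample and compares the apparent error (evaluated on the data used for fitting) with the k-fold cross-validated error. Everything here is an assumption made for illustration: the data are simulated, the Brier score is used as the error measure, the model is fitted by plain gradient descent, and all function names are hypothetical. The source does not prescribe a specific procedure.

```python
import math
import random


def predict_one(w, xi):
    """Predicted probability from weights w (intercept first) and predictors xi."""
    z = w[0] + sum(wj * xij for wj, xij in zip(w[1:], xi))
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, steps=500, lr=0.1):
    """Fit a logistic regression by plain gradient descent on the log-likelihood."""
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(steps):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            err = predict_one(w, xi) - yi
            grad[0] += err
            for j, xij in enumerate(xi):
                grad[j + 1] += err * xij
        w = [wj - lr * gj / len(X) for wj, gj in zip(w, grad)]
    return w


def brier(w, X, y):
    """Mean squared difference between predicted risk and observed outcome."""
    return sum((predict_one(w, xi) - yi) ** 2 for xi, yi in zip(X, y)) / len(y)


def k_fold_indices(n, k, rng):
    """Shuffle 0..n-1 and split into k disjoint folds."""
    idx = list(range(n))
    rng.shuffle(idx)
    return [idx[i::k] for i in range(k)]


def cross_validated_brier(X, y, k=5, seed=0):
    """Brier score where each subject is predicted by a model fitted without it."""
    folds = k_fold_indices(len(y), k, random.Random(seed))
    total = 0.0
    for test_idx in folds:
        held_out = set(test_idx)
        train = [i for i in range(len(y)) if i not in held_out]
        w = fit_logistic([X[i] for i in train], [y[i] for i in train])
        total += sum((predict_one(w, X[i]) - y[i]) ** 2 for i in test_idx)
    return total / len(y)


# Synthetic data: only the first of six predictors is truly related to the outcome.
rng = random.Random(42)
X = [[rng.gauss(0, 1) for _ in range(6)] for _ in range(60)]
y = [1 if rng.random() < 1 / (1 + math.exp(-1.5 * xi[0])) else 0 for xi in X]

apparent = brier(fit_logistic(X, y), X, y)
cross_validated = cross_validated_brier(X, y, k=5)
print(f"apparent Brier score:        {apparent:.3f}")
print(f"cross-validated Brier score: {cross_validated:.3f}")
```

The gap between the two scores is an estimate of the optimism of the apparent fit. In a model selection setting, the selection step itself would have to be repeated inside each fold for the estimate to remain honest.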

An important criterion for developing a risk prediction model is its “discriminative quality.” The discriminative quality of a risk prediction model can be measured in terms of the “area under the receiver operating characteristic curve” (AUROC), which summarizes the sensitivity of a risk prediction model in relation to one-minus-specificity across all possible cut-off points. For comparison of different risk models (e.g., a conventional model and an extended model including new markers), the difference in the models’ AUROC values can be tested 804. In the presence of established risk thresholds, the “net reclassification improvement” (NRI) can be calculated as an alternative measure to compare the classifications resulting from different risk models. As an extension of this idea of capturing reclassification, the “integrated discrimination improvement” (IDI) summarizes the general improvement in discrimination. These alternative statistics provide valuable additional insight into the discriminative quality of risk models 805.
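The three measures named above can be illustrated with a short Python sketch. It assumes the common definitions: AUROC as the probability that a randomly chosen case receives a higher predicted risk than a randomly chosen non-case, the category-based NRI, and the IDI as the change in the difference of mean predicted risk between cases and non-cases. The predicted risks, outcomes, and thresholds are invented for illustration, the function names are hypothetical, and no significance test for the AUROC difference is included.

```python
def auroc(risks, outcomes):
    """P(case ranks above non-case); ties count one half."""
    cases = [r for r, y in zip(risks, outcomes) if y == 1]
    controls = [r for r, y in zip(risks, outcomes) if y == 0]
    wins = 0.0
    for c in cases:
        for d in controls:
            if c > d:
                wins += 1.0
            elif c == d:
                wins += 0.5
    return wins / (len(cases) * len(controls))


def risk_category(risk, thresholds):
    """Index of the risk category: number of thresholds reached or exceeded."""
    return sum(risk >= t for t in thresholds)


def net_reclassification_improvement(old, new, outcomes, thresholds):
    """Net upward moves among cases plus net downward moves among non-cases."""
    up_case = down_case = up_ctrl = down_ctrl = 0
    for o, n, y in zip(old, new, outcomes):
        move = risk_category(n, thresholds) - risk_category(o, thresholds)
        if y == 1:
            up_case += move > 0
            down_case += move < 0
        else:
            up_ctrl += move > 0
            down_ctrl += move < 0
    n_cases = sum(outcomes)
    n_controls = len(outcomes) - n_cases
    return (up_case - down_case) / n_cases + (down_ctrl - up_ctrl) / n_controls


def integrated_discrimination_improvement(old, new, outcomes):
    """Change in (mean risk of cases - mean risk of non-cases) between models."""
    def discrimination_slope(risks):
        cases = [r for r, y in zip(risks, outcomes) if y == 1]
        controls = [r for r, y in zip(risks, outcomes) if y == 0]
        return sum(cases) / len(cases) - sum(controls) / len(controls)
    return discrimination_slope(new) - discrimination_slope(old)


# Invented example: three cases, three non-cases, risk thresholds at 10% and 20%.
outcomes = [1, 1, 1, 0, 0, 0]
conventional = [0.30, 0.15, 0.08, 0.25, 0.12, 0.05]
extended = [0.35, 0.22, 0.09, 0.18, 0.06, 0.04]
thresholds = [0.10, 0.20]

print("AUROC conventional:", round(auroc(conventional, outcomes), 3))
print("AUROC extended:    ", round(auroc(extended, outcomes), 3))
print("NRI:", round(net_reclassification_improvement(conventional, extended, outcomes, thresholds), 3))
print("IDI:", round(integrated_discrimination_improvement(conventional, extended, outcomes), 3))
```

In this toy data set the extended model ranks cases above non-cases more often (AUROC 8/9 versus 6/9), moves one case into a higher risk category, and moves two non-cases into lower ones.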

Finally, a further important step in constructing risk prediction models involves translating RRs into predicted absolute risk levels. Absolute risk thresholds may be interpreted more directly in terms of potential risks and benefits if the model is used to identify subjects for interventions or treatments. With a cohort design, absolute risk levels can be estimated within the study 806. External data from registries can be included as an additional source. The two sources can be used to validate or optimize the calibration of a discriminative model 807. In addition to case numbers, the cohort can provide valuable information on the distribution of some, if not all, risk factors. The extent to which the cohort is representative of the general population may have to be critically evaluated at this point.
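One way to translate an RR into an absolute risk is sketched below in Python. It assumes a proportional hazards model, in which a subject whose hazard is RR times the baseline hazard has survival S0(t) raised to the power RR, and it ignores competing risks. The baseline survival value, the function names, and the observed-to-expected ratio used as a simple calibration check are illustrative assumptions, not the procedure specified in the source.

```python
def absolute_risk(baseline_survival, relative_risk):
    """Risk of disease by time t: 1 - S0(t) ** RR under proportional hazards."""
    if not 0.0 < baseline_survival <= 1.0:
        raise ValueError("baseline survival must be in (0, 1]")
    if relative_risk < 0:
        raise ValueError("relative risk must be non-negative")
    return 1.0 - baseline_survival ** relative_risk


def observed_to_expected(predicted_risks, outcomes):
    """Calibration-in-the-large: observed case count over sum of predicted risks."""
    return sum(outcomes) / sum(predicted_risks)


# Hypothetical 10-year baseline survival of 95%, e.g. estimated from the cohort
# or from registry incidence rates.
s0_10y = 0.95
for rr in (1.0, 2.0, 4.0):
    print(f"RR {rr:.0f}: 10-year absolute risk = {absolute_risk(s0_10y, rr):.4f}")

# A ratio near 1 suggests that predicted and observed case numbers agree overall.
print("O/E:", round(observed_to_expected([0.1, 0.2, 0.3, 0.4], [0, 0, 1, 0]), 3))
```

Because the risk is not linear in the RR, doubling the RR slightly less than doubles the absolute risk (0.0975 rather than 0.10 in this example), which is one reason absolute thresholds are easier to interpret for intervention decisions.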

A.6.2.5 Statistical analyses for studies embedded into the National Cohort

In situations in which data from the full cohort are not available due to financial, organizational, or other constraints, “hybrid” designs will be used. These can be properly defined
