24.01.2013 Views

Rob van Hest Capture-recapture Methods in Surveillance - RePub ...


Underreporting of tuberculosis in England

5.5% and 4.9% respectively. 19,20 The proportion of not notified culture-confirmed tuberculosis cases in England could be an overestimate resulting from possible imperfect record-linkage or, despite our assumption, remaining false-positive records in the Laboratory data source.

Limitations due to imperfect record-linkage and false-positive records

Imperfect record-linkage causes misclassification and results in observed and estimated numbers of tuberculosis cases being too low or too high. Our data show that 94.9% of the linked cases have a high likelihood of association score of 3000 points or more, and only 5.1% with such a score were not linked. This indicates that in only a minority of candidate-links an error of classification could have occurred. This fulfils our purpose of record-linkage resulting in unbiased numbers in each category, with possibly some balanced misclassification. The relatively stable annual proportional distribution of tuberculosis cases and the decreasing annual proportion of unlinked Notification and Laboratory cases give further confidence in the record-linkage software and procedure.
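
The direction of the bias caused by linkage errors can be illustrated with a minimal sketch. It assumes the simple two-source Lincoln-Petersen estimator rather than the three-source log-linear models used in this study, and all counts are hypothetical and chosen only for illustration.

```python
def observed_total(n1: int, n2: int, m: int) -> int:
    """Number of distinct cases observed in two sources with m linked cases."""
    return n1 + n2 - m


def lincoln_petersen(n1: int, n2: int, m: int) -> float:
    """Two-source estimate of the total number of cases: N = n1 * n2 / m."""
    if m <= 0:
        raise ValueError("at least one linked case is required")
    return n1 * n2 / m


# Hypothetical counts, not taken from the study.
notified = 800     # cases in the first source
laboratory = 600   # cases in the second source
true_links = 500   # cases truly present in both sources

baseline = lincoln_petersen(notified, laboratory, true_links)

# Missed links (false non-matches) shrink the overlap, so both the observed
# and the estimated number of cases become too high.
missed = lincoln_petersen(notified, laboratory, true_links - 50)

# Spurious links (false matches) enlarge the overlap, so both numbers
# become too low.
spurious = lincoln_petersen(notified, laboratory, true_links + 50)

print(f"observed: {observed_total(notified, laboratory, true_links)}")
print(f"baseline estimate: {baseline:.0f}")
print(f"with missed links: {missed:.0f}")
print(f"with spurious links: {spurious:.0f}")
```

If missed and spurious links occur in roughly equal numbers, the overlap is largely unchanged, which is the sense in which balanced misclassification leaves the numbers approximately unbiased.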

A low positive predictive value of tuberculosis data sources results in observed and estimated numbers of tuberculosis cases being too high. Lack of specificity of data sources used in capture-recapture studies has previously been described as a limitation to the validity of this method. 22,23 Not all tuberculosis cases are defined by gold standard laboratory-confirmation and diagnosis can be based on a clinical intention to treat. The three data sources used employ different case-definitions, with consequent variations in specificity. We demonstrated by cross-validation with additional datasets that failure to de-notify or re-classify patients with a final diagnosis of not tuberculosis occurs, which will also reduce positive predictive value.
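
As a rough illustration of why a low positive predictive value inflates case counts, the sketch below splits the records of a data source into expected true cases and false positives. The function name, the record count and the predictive values are hypothetical and are not results from this study.

```python
def expected_true_cases(observed: int, ppv: float) -> float:
    """Expected number of true cases among observed records for a given PPV."""
    if not 0.0 <= ppv <= 1.0:
        raise ValueError("ppv must lie between 0 and 1")
    return observed * ppv


# Hypothetical data source containing 1000 records.
observed_records = 1000

for ppv in (1.0, 0.9, 0.75):
    true_cases = expected_true_cases(observed_records, ppv)
    false_positives = observed_records - true_cases
    print(f"PPV {ppv:.2f}: {true_cases:.0f} true cases, "
          f"{false_positives:.0f} false positives counted as cases")
```

Any false positives left in a source are counted as observed cases and, because they are unlikely to be linked to other sources, also raise the estimated number of unobserved cases.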

The population mixture model estimates a proportion of 72% remaining false-positive cases among unlinked Hospital cases, contributing to 26.7% false-positive cases among all Hospital cases, and resulting in a final average proportion of true unlinked Hospital cases of 5.4%. These results are in good agreement with comparable record-linkage studies of tuberculosis incidence in the UK and elsewhere, indicating a plausible logistic regression model but expressing concern about the contribution of unscrutinised Hospital data sources to accurate estimates of tuberculosis incidence. 8,17,19,20

Limitations due to violation of the underlying capture-recapture assumptions

The capture-recapture findings have to be placed in the context of the limitations of this study. The assessment of the coverage of the tuberculosis data sources was based on three-source log-linear capture-recapture models, only valid in the absence of violation of their underlying assumptions: perfect record-linkage (i.e. no misclassification of records), a closed population (i.e. no immigration or emigration in the time period studied) and a homogeneous population (i.e. no subgroups with markedly different probabilities to be observed and re-observed). In two-source capture-recapture methods one must also assume independence between data sources (i.e. the probability of being observed in one data source is not affected by being (or not being) observed in another). 9 In the three-source capture-recapture approach dependencies between two data sources can be identified and incorporated in the log-linear model. The three-way
