CLIN. CHEM. 27/1, 5-9 (1981)

A Clinical Evaluation of Various Delta Check Methods

Lawrence A. Wheeler1 and Lewis B. Sheiner2
To evaluate the performance of delta check techniques, we analyzed 707 unselected pairs of continuous-flow test results, using three different delta check methods. If any of the test results (plus the urea nitrogen/creatinine ratio and the anion gap) failed one of the checks, the reason for the failure was sought by examining subsequent test results, retesting specimens, and (or) reviewing the patient's chart. Each delta check failure was accordingly classified as a true or false positive. The percentage of positives we judged to be true positives ranged from 5 to 29%. Each of the three methods had test types with low and high percentages of true positives. We conclude that with the delta check methods one can detect errors otherwise overlooked, but at the cost of investigating many false positives, because, in the population we studied, disease processes or therapy often caused large changes in a series of test results for a patient.
The concept of using prior test results from a patient to determine whether a newly obtained test result is likely to be in error ("delta checks") is very attractive (1-4). First, it is a direct approach in which the test results of interest are themselves evaluated, rather than an indirect method such as traditional quality-control techniques. The latter methods only evaluate the performance of the test procedure on a quality-control specimen; errors in specimen identification, test performance, and test-result reporting for the clinical specimen cannot be detected.
Second, the magnitude of the delta can be chosen so that the delta check will always fail when the change, if it is not artifactual, is clinically important. This process will alert laboratory personnel to test results that, if incorrect, could result in inappropriate therapy. If an effective procedure is implemented to follow up delta check failures, two important advantages are realized. Many or all (depending on the extent of the follow-up procedure) incorrect test results, where the error has resulted in a test result that differs significantly from a previous value, can be detected and not released. In addition, those test results that represent actual clinically important changes and pass the laboratory's follow-up procedure for delta check failure can be so indicated on the test-results report, thus alerting the clinician to clinically important changes and increasing his or her confidence that these changes are not simply laboratory errors. This should eliminate some unnecessary retesting and allow appropriate clinical steps to be taken more promptly.
These potential benefits of delta check techniques have prompted proposals for their adoption by many groups, including the College of American Pathologists, which in its Inspection and Accreditation Program specifies the absence of delta check techniques to be a Phase I (minor) deficiency for laboratories with laboratory computer systems. Unfortunately, while the concept of delta checking has been accepted, no clinical trial has tested the relative effectiveness of the delta check methods that have been proposed for clinical chemistry tests.

1 Department of Pathology, Indiana University, Indianapolis, IN 46223.
2 Department of Medicine, Division of Clinical Pharmacology, and Department of Laboratory Medicine, University of California, San Francisco, CA 94143.
Received July 14, 1980; accepted Sept. 5, 1980.
Our purpose was to evaluate the three currently proposed delta check procedures (1-3) that are applicable to some or all of the SMA 6 continuous-flow analysis results. This evaluation involved using subsequent test results, repeat determinations, and chart review to classify delta check failures into true positives (errors made in specimen identification, test performance, or test-result reporting) and false positives (changes ascribable to physiological responses to disease or therapy).
Materials and Methods

The test results used in this study were collected with the Community Health Computing laboratory computer system of the California Medical Center. For five consecutive days (Monday-Friday), all of the SMA 6 (Technicon Instruments Corp., Tarrytown, NY 10591) tests done during the morning hours were evaluated by using the delta check methods of Ladenson (1), Whitehurst et al. (2), and Wheeler and Sheiner (3). The Wheeler-Sheiner method uses points on two probability density functions of delta values. One probability density function was obtained when the two results used to form the delta were 0.9 to 1.5 days apart, and the other when they were 1.5 to 2.5 days apart. To allow this method to be evaluated, we included in the study only those SMA 6 results for which another set of SMA 6 results had been obtained 0.9 to 2.5 days previously. In addition, the set of values from the probability density functions that corresponded to the 0.05 and 0.95 points was used in the study (i.e., nominally 10% of the test results in any particular group could be expected to fail the Wheeler-Sheiner delta check).
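The limit-setting step can be pictured as taking quantile points of an empirical delta distribution. The sketch below is a loose illustration of that idea, not the authors' published tables; the crude quantile rule and the synthetic data are assumptions.

```python
# Illustrative sketch: deriving delta check limits from an empirical
# distribution of deltas (differences between paired results obtained a
# fixed interval apart), in the spirit of a Wheeler-Sheiner-style method.
# The quantile computation and example data are assumptions, not the
# published tables.

def delta_limits(deltas, lower_q=0.05, upper_q=0.95):
    """Return (lower, upper) limits at the given quantile points."""
    s = sorted(deltas)
    n = len(s)
    lo = s[max(0, int(lower_q * n) - 1)]   # crude quantile, no interpolation
    hi = s[min(n - 1, int(upper_q * n))]
    return lo, hi

def fails_delta_check(delta, limits):
    """A delta outside the limits fails the check."""
    lo, hi = limits
    return delta < lo or delta > hi
```

With limits at the 0.05 and 0.95 points, roughly one result in ten in the reference population fails by construction, which matches the nominal 10% failure rate quoted above.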
We designed an algorithm to classify each test result that failed a delta check as a true or false positive. A test result that failed a delta check because of an actual change in the patient's analyte value was defined to be a false positive. A positive that was due to any other reason was defined as a true positive (i.e., an error was made in identifying the specimen in the patient-care area, a specimen-identification error had occurred in the laboratory, the SMA 6 had malfunctioned, or the test result was incorrectly entered into the laboratory computer). Note that this algorithm operates at the individual test (e.g., Na+) level. If two or more test results from a specimen failed delta checks, each was independently classified as a true or false positive.
We believed that having laboratory personnel immediately collect another specimen and re-run it on the SMA 6 would be the best approach to determine whether or not a test result that failed a delta check was an error, although in some cases even this process might not yield definitive evidence.
The first step in the algorithm (see Figure 1) is an approximation of this method. In the discussion that follows, the previous test result that was compared with the current test result will be designated TR1; the current test result, TR2; and a subsequent test result, obtained within 24 h, TR3. If a TR3 was available, an arbitrary rule was used to judge whether it indicated that TR2 represented an actual change in the patient's serum, not an error: if TR3 was nearer to TR2 than to TR1, TR2 was considered to be correct and the delta check failure was specified to be a false positive.

Fig. 1. Delta check evaluation algorithm
If the TR3 was closer to TR1 than to TR2, or if a TR3 was not obtained, the second step in the algorithm was carried out. This step included performing a repeat SMA 6 determination on the TR2 specimen (when there was sufficient specimen). The result of this determination is designated TR2R. When the difference between TR2 and TR2R exceeded three times the standard deviation of the corresponding test method, a laboratory test-performance error was judged to have occurred and the delta check failure was designated a true positive. If TR2R validated TR2, or if sufficient specimen was not available to perform a repeat determination, the third step of the algorithm was performed.
The third step of the algorithm was to review the patient's chart. This review focused on a search for the etiology of the delta. Examples of some clinical situations that we accepted as being the cause of large changes in SMA 6 test results are:
1. Renal dialysis done between the times the two specimens were drawn.
2. Potassium supplementation given (as a reason for a potassium increase).
3. Recent renal transplant, as a reason for decreasing urea nitrogen and creatinine (normal pattern) or increasing urea nitrogen and creatinine (rejection).
4. Intravenous therapy with an electrolyte-containing fluid.
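The three steps above can be sketched as a small decision procedure. This is a paraphrase of the Figure 1 logic as stated in the text; the function name and the way the chart-review finding is passed in are our assumptions.

```python
# Sketch of the three-step classification algorithm described in the text.
# sd is the analytical standard deviation of the test method; tr3, tr2r,
# and chart_explains_delta are supplied only when that step was possible.

def classify_delta_failure(tr1, tr2, sd, tr3=None, tr2r=None,
                           chart_explains_delta=None):
    """Classify a delta check failure as a false positive (real change),
    a true positive (laboratory error), or leave it unjudged."""
    # Step 1: a subsequent result (TR3) obtained within 24 h.
    if tr3 is not None and abs(tr3 - tr2) < abs(tr3 - tr1):
        return "false positive"        # TR3 confirms TR2: actual change
    # Step 2: repeat determination (TR2R) on the TR2 specimen.
    if tr2r is not None and abs(tr2 - tr2r) > 3 * sd:
        return "true positive"         # repeat contradicts TR2: lab error
    # Step 3: chart review for a clinical explanation of the delta.
    if chart_explains_delta is True:
        return "false positive"        # disease or therapy explains it
    if chart_explains_delta is False:
        return "true positive"         # no explanation found
    return "no judgment"
```

For example, for a K+ delta check failure with TR1 = 4.0 and TR2 = 6.0 mmol/L, a next-day TR3 of 5.9 mmol/L would classify the failure as a false positive.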
Because we investigated only those test results that failed one or more of the delta check methods, we cannot divide the test results that did not fail a particular delta check into true and false negatives with total accuracy. However, in several cases a test result that did not fail one method did fail one or both of the other methods and was therefore investigated, so we can assign these delta check failures to be true or false negatives. Further, if it is assumed that all “medically significant” changes in test result values will be detected by one of the three methods, true- and false-negative percentages can be computed for all the test result delta check combinations except the urea nitrogen/creatinine ratio and anion gap, which were evaluated only by the Wheeler-Sheiner method. This assumption yields a lower bound for the false-negative rates, because two classes of undetected test result errors will not be included. The first is the “small” error (e.g., reporting a K+ result that was actually determined to be 3.2 mmol/L as 3.4 mmol/L). This clearly is an error, but it should have no impact on patient care. Another example of this type of error would be a switch of labels on specimens from two patients with “normal” values for electrolytes, urea nitrogen, and creatinine. All the results would be in error, but they might not differ sufficiently to trigger a delta check failure.

Fig. 2. Delta check evaluation algorithm results for the Wheeler-Sheiner delta check method applied to urea nitrogen test results
The second type of test result error that would not be included in the false-negative calculation is one in which the true test result differs greatly from the previous test result, but the erroneous test result is nearly the same. For example, yesterday's K+ result was 4.0 mmol/L and the true value for today's K+ is 6.0 mmol/L, but an error is made such that a value of 3.9 mmol/L is reported. Delta check methods are by definition unable to detect this type of error.

Despite the above problems, we believe that method performance estimates based on this assumption are useful, because "small" errors are not of clinical importance and the second type of error should be relatively rare.
Results

A total of 707 sets of SMA 6 results (including the urea nitrogen/creatinine ratio and anion gap) were included in the study. Of these, 253 had an SMA 6 analysis performed 0.9 to 2.5 days previously and therefore satisfied the criterion for delta check evaluation. Of these 253 sets of SMA 6 results, 150 (59%) had at least one test result that failed one or more of the three delta check methods. The algorithm described in Figure 1 was followed in all the cases. Figure 2 presents the results of this process for urea nitrogen test results that failed the Wheeler-Sheiner delta check. A total of 27 test results failed the Wheeler-Sheiner delta check. Of these, 17 were judged to be false positives because the result of a determination made on a specimen on the next day (TR3) was nearer to TR2 than TR1. Three results had TR3 values nearer to TR1 than TR2, and in seven cases a urea nitrogen determination was not done the next day. The 10 specimens in the two latter groups were candidates for repeat tests (i.e., obtaining TR2R values).
In one case the repeat determination was not done. The reason repeat determinations were not done was not recorded for each case; however, the reason was usually that an insufficient volume of specimen was available. In two cases the TR2R value exceeded three test-method standard deviations from TR2. In these cases the repeat value was judged to indicate that the TR2 value was in error, and therefore these two cases were deemed to represent true positives (i.e., an error had taken place in the performance or reporting of the test yielding TR2). In seven cases the absolute magnitude of the difference between TR2 and TR2R was less than or equal to three test-method standard deviations.
In the eight cases for which a judgment could not be made on the basis of repeat test values, the third step in the algorithm (chart review) was performed, if possible. In one case the chart was not reviewed, because it could not be located at the time. In six cases a clinical reason for the change in the test results was found on chart review. These six cases were judged to represent false positives; that is, an actual physiological change had occurred in the patient. In the remaining case we could find no reason for the change in the chart, and therefore this case was specified to be a true positive.

Table 1 presents the performance of the three delta check methods in this study. The true- and false-positive results were obtained by use of the algorithm described and illustrated above.
The “No judgment made” column lists the number of test results of each type that failed one of the delta checks and that we were unable to assign to one of the other classes. This amounts to 13 of 193 (7%) of the Wheeler-Sheiner method values, nine of 91 (9%) of the Ladenson method values, and seven of 141 (5%) of the Whitehurst method values.
The “Predictive value” column presents data on the relative efficiency of the methods in practice. The values range from a low of 5% for K+ by the Whitehurst method to 29% for creatinine by the Wheeler-Sheiner method.
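Predictive value here is simply the percentage of delta check failures that turned out to be true positives. A one-line sketch follows; the example counts (six true and 15 false positives for creatinine by the Wheeler-Sheiner method) are our reading of Table 1 and should be treated as illustrative.

```python
# Predictive value of a delta check: percentage of failures judged to be
# true positives. The example counts are illustrative readings of Table 1.

def predictive_value(true_positives, false_positives):
    return 100 * true_positives / (true_positives + false_positives)

round(predictive_value(6, 15))  # about 29% for the example counts
```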
The “Error incidence rate” column presents information on the (inferred) error rate for the tests included in the study. The error rate ranged from 1.2% for Na+ to 4.0% for urea nitrogen.
We examined the values of the deltas that we judged to be true or false positives, to see if different choices of the delta check limits would yield better performance. For example, it could have been the case that most or all of the deltas that were judged to be true positives were larger in magnitude than the false positives. We found that if the delta check limits were selected to be as large as possible in magnitude while still having all the true positives fail the delta check (e.g., for Na+ the three deltas that were judged to be true positives were 7, -11, and -11 mmol/L, and therefore the Na+ delta check limits were selected to be -10 and 6 mmol/L), the delta check performance was not greatly improved. The percentage of the deltas failing this delta check that corresponded to true positives ranged from 14% for K+ to 37% for creatinine. These results show that, for the population considered in this study, no adjustment in the delta check limits would have resulted in greatly reduced false-positive rates.
An important but little-discussed version of the delta check is to combine results of delta checks of several separate tests performed on a single specimen, to determine whether a specimen-identification error has occurred. Table 2 presents the data obtained in this study. Each array of combinations of true and false positives adds up to 150, the total number of specimens for which one or more test results failed one or more of the delta check methods. The entry under zero false positives and zero true positives represents the number of specimens that had no test results failing that delta check method. Not surprisingly, the number of specimens in this category is inversely related to the number of tests included in the delta check method.
The Wheeler-Sheiner method has a total of 16 specimens with only true positives, five with both true and false positives, and 91 with only false positives. If a specimen is judged to be a true positive when at least one true positive is detected for it, then the percentage of positive specimens with true positives is 19% (21/112). This means that in this study approximately one fifth of the specimens that had one or more tests fail the Wheeler-Sheiner delta check were found to have at least one test result in error. By the Ladenson method the true-positive rate was 16% (11/70); by the Whitehurst method, 18% (16/90).
Discussion

We evaluated the performance of three delta check methods in clinical use by applying them to 707 sets of SMA 6 results and attempting to determine the etiology of each delta check failure (Figure 1).

Examination of the column of Table 1 that gives the total number of tests failing the delta check tells us about the relative stringency of the delta check methods. In general, the three delta check methods prescribe a more complicated decision procedure than simply being positive whenever a maximum difference between the current value and the most recent previous value is exceeded. For example, for the serum urea nitrogen (UN) test, the criteria used by the three methods are given below:
Method: Delta failure criteria for UN

Wheeler-Sheiner: Δ < -7 or Δ > 3; Δ < -11 or Δ > 8; Δ < -38 or Δ > 39

Ladenson: |UN1 - UN2| / UN1 > 0.5

Whitehurst: |UN1 - UN2| / UN2 > 0.5 when UN2 ≤ 25; |UN1 - UN2| / UN2 > 0.25 when UN2 > 25
Clearly the Whitehurst rule is the most stringent and the Ladenson rule the least stringent. The Wheeler-Sheiner rule is more complex and falls between these two extremes. This property of the rules is reflected by the total number of UN tests failing each method (Ladenson 10, Wheeler-Sheiner 25, and Whitehurst 56).
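As a rough sketch, the UN rules can be coded as predicates. The thresholds follow the criteria listed above; pairing the two Wheeler-Sheiner limit pairs with the two sampling-interval bands is our assumption (the method's full published tables are not reproduced here), and the third Wheeler-Sheiner condition is omitted because its form is unclear in the source.

```python
# Tentative sketches of the urea nitrogen (UN) delta check rules. Assigning
# the two Wheeler-Sheiner limit pairs to the two time intervals is an
# assumption; the published tables contain the authoritative values.

def wheeler_sheiner_un(un1, un2, days_apart):
    delta = un2 - un1
    lo, hi = (-7, 3) if days_apart < 1.5 else (-11, 8)
    return delta < lo or delta > hi

def ladenson_un(un1, un2):
    # Fails on a relative change of more than 50% of the previous result.
    return abs(un1 - un2) / un1 > 0.5

def whitehurst_un(un1, un2):
    # Tighter relative limit (25%) at concentrations above 25.
    limit = 0.5 if un2 <= 25 else 0.25
    return abs(un1 - un2) / un2 > limit
```

For example, a rise from 20 to 24 one day later fails the Wheeler-Sheiner check (delta of 4 exceeds 3) but passes the other two rules.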
Next consider the true-positive and false-positive columns of Table 1. These values indicate the yield in error detection of each of the test/delta check method combinations. The predictive value column of Table 1 gives the percentage of positives that we judged to be true positives. The values range from very low values for K+ by the Whitehurst method (5%), Na+ by the Wheeler-Sheiner method (6%), K+ by the Wheeler-Sheiner method (6%), and urea nitrogen by the Ladenson method (6%) to relatively high values for chloride by the Whitehurst method (20%), bicarbonate by the Wheeler-Sheiner method (20%), urea nitrogen/creatinine ratio by the Wheeler-Sheiner method (28%), and creatinine by the Wheeler-Sheiner method (29%). Note that even with the best of these, more than 70% of the delta check failures are false positives. This result is a consequence of the fact that in the patient population to which these methods were applied (i.e., patients for whom two SMA 6's were ordered within 2.5 days), large variations in these types of test results are commonly seen that are attributable to disease or therapy. These results indicate the price, in follow-up of false positives, that the laboratory will have to pay to detect errors, when it is possible to do so by using delta check methods.
It is entirely possible that if our patient population had been composed predominantly of healthy people (e.g., marines taking periodic physical examinations) or of patients with relatively well-controlled diseases (e.g., medicine clinic patients), the results would have been different. In such populations the large changes in test values that we attributed to disease processes or therapy in the current study should be infrequent.

Table 1. Delta Check Methods: Performance Characteristics

| Method | Test | No. failing delta check | True positives (TR2R indicates error / no reason in chart) | False positives (TR3 confirms TR2 / reason in chart) | False neg. | True neg. | No judgment made | Predictive value, % | Error incidence rate, % |
| Wheeler-Sheiner (3) | Na+ | 18 | 0 / 1 | 13 / 4 | 2 | 233 | 0 | 6 | 1.2 |
| | K+ | 35 | 0 / 2 | 24 / 6 | 4 | 214 | 3 | 6 | 2.4 |
| | Cl- | 24 | 3 / 0 | 15 / 3 | 1 | 228 | 3 | 14 | 1.6 |
| | HCO3 | 25 | 3 / 2 | 14 / 5 | 0 | 228 | 1 | 21 | 2.0 |
| | UN | 27 | 2 / 1 | 17 / 6 | 7 | 219 | 1 | 12 | 4.0 |
| | Cr | 23 | 6 / 0 | 10 / 5 | 1 | 229 | 2 | 29 | 2.8 |
| | UN/Cr | 26 | 5 / 2 | 14 / 4 | a | 227 | 1 | a | a |
| | anion gap | 15 | 1 / 0 | 8 / 4 | a | 238 | 2 | a | a |
| Ladenson (1) | Na+ | 14 | 1 / 1 | 10 / 2 | 1 | 238 | 0 | 14 | 1.2 |
| | K+ | 49 | 1 / 5 | 26 / 10 | 0 | 204 | 7 | 12 | 2.4 |
| | UN | 17 | 0 / 1 | 13 / 2 | 9 | 227 | 1 | 6 | 4.0 |
| | Cr | 11 | 2 / 0 | 8 / 0 | 5 | 237 | 1 | 20 | 2.8 |
| Whitehurst (2) | Na+ | 18 | 1 / 2 | 11 / 4 | 0 | 235 | 0 | 17 | 1.2 |
| | K+ | 22 | 0 / 1 | 15 / 4 | 5 | 226 | 2 | 5 | 2.4 |
| | Cl- | 10 | 1 / 1 | 5 / 3 | 2 | 241 | 0 | 20 | 1.6 |
| | HCO3 | 17 | 2 / 1 | 10 / 3 | 2 | 234 | 1 | 19 | 2.0 |
| | UN | 60 | 5 / 5 | 36 / 10 | 0 | 193 | 4 | 18 | 4.0 |
| | Cr | 14 | 2 / 0 | 11 / 1 | 5 | 234 | 0 | 14 | 2.8 |

a False negatives could not be detected for these test results; therefore these calculations were not performed.
Cr, creatinine; UN, serum urea nitrogen.
Examination of the true-positive and false-negative columns of Table 1 indicates that relatively few test results of each type that we evaluated were judged to be in error. The error incidence rate column gives the exact values. The range of error incidence, from 1.2 to 4.0%, is somewhat higher than laboratorians would expect to find, and may be due to our algorithm "overdiagnosing" changes as being due to errors rather than to physiology.
One approach to utilizing delta checks in a computerized laboratory would be to have a message that the delta check has failed provided to the technologist at the time the test result is recorded. The technologist would check for transcription errors at that time. If the technologist could not find a reason for the delta check failure, the result should be referred to a pathologist for review. What happened next should depend on the pathologist's judgment of the potential adverse effect of the test result being in error. Follow-up actions could include repeating the test or calling the physician who ordered the test to determine if the patient's clinical condition or therapy provided an explanation for the large change in the test result value.

If it is decided to release the test result, it should be marked with an appropriate symbol so that the clinician will be aware that it represents a large change, that it has been reviewed in the laboratory by a pathologist, and that the review process did not reveal that the test result was in error.

Table 2. Specimen Level True Positive (TP) and False Positive (FP) Results
(rows: TP per specimen; columns: FP per specimen)

Wheeler-Sheiner method (3):
| TP \ FP | 0 | 1 | 2 | 3 | 4 | 5 |
| 0 | 38 | 58 | 21 | 9 | 2 | 1 |
| 1 | 11 | 0 | 2 | 1 | 0 | 0 |
| 2 | 5 | 0 | 1 | 1 | 0 | 0 |
| 3 | 0 | 0 | 0 | 0 | 0 | 0 |

Ladenson method (1):
| TP \ FP | 0 | 1 | 2 | 3 | 4 | 5 |
| 0 | 80 | 47 | 12 | 0 | 0 | 0 |
| 1 | 11 | 0 | 0 | 0 | 0 | 0 |
| 2 | 0 | 0 | 0 | 0 | 0 | 0 |
| 3 | 0 | 0 | 0 | 0 | 0 | 0 |

Whitehurst method (2):
| TP \ FP | 0 | 1 | 2 | 3 | 4 | 5 |
| 0 | 60 | 51 | 13 | 7 | 2 | 0 |
| 1 | 10 | 1 | 3 | 0 | 0 | 0 |
| 2 | 2 | 0 | 0 | 0 | 0 | 0 |
| 3 | 1 | 0 | 0 | 0 | 0 | 0 |

We can consider the performance of delta check methods that require two or more test results to fail delta checks for the specimen to fail a delta check by examining Table 2. Interestingly, many specimens had more than one test fail a delta check. By the Wheeler-Sheiner method, 43 (38%) of the specimens that failed one delta check also failed two or more. The Whitehurst method had 43% (39/90) and the Ladenson method 17% (12/70). Consider a delta check rule specifying that a specimen be judged as failing a delta check if two or more test results fail delta checks. Using this rule with the Wheeler-Sheiner method results, we find that 23% (10/43) of our specimens fail this specimen-level delta check and include at least one test result delta check failure that was a true positive. This is a slight improvement over the rule that one test is needed to fail a delta check; however, this rule missed 52% (11/21) of the specimens that had at least one true positive. The number of specimens that would have to be evaluated has been reduced from 112 to 43; on the other hand, missing 11 of 21 specimens with erroneous test results is clearly undesirable.
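The specimen-level rule discussed above reduces to counting failed per-test checks. A minimal sketch follows; the function name and the example outcomes are illustrative, not taken from the study data.

```python
# Sketch of the specimen-level rule: flag a specimen only when at least
# `min_failing` of its individual test results fail their delta checks.
# The example outcomes below are illustrative.

def specimen_fails(per_test_failures, min_failing=2):
    """per_test_failures maps test name -> whether its delta check failed."""
    return sum(per_test_failures.values()) >= min_failing

example = {"Na": False, "K": True, "Cl": False, "HCO3": True,
           "UN": False, "Cr": False, "UN/Cr": False, "anion gap": False}
specimen_fails(example)  # True: two of the eight checks failed
```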
In a recent paper (4) we evaluated the performance of delta check methods by using a simulation approach, and found, using the Wheeler-Sheiner method as discussed above (i.e., a specimen fails if two or more of the eight test results fail delta checks), that the true-positive rate was 84% and the false-positive rate 20%. Therefore, if the specimen errors in this study were of the same type studied previously (i.e., mislabeled specimens), we would expect 44 specimens to have failed the "at least two of eight" delta check (18 true positives and 26 false positives). In fact 43 specimens failed (10 true positives and 33 false positives), most likely because many errors other than specimen mislabeling can occur. In general, specimen mislabeling causes all the test results to be in error, while other error sources, such as machine malfunction and test-result transcription errors, tend to cause only one to be in error. Therefore the latter errors will not be detected by a delta check method that requires two or more test results to fail individual delta checks. Note that 11 specimens with one true positive and no false positives are missed by using this rule. These specimens probably represent cases of the second type of error.
In summary, we show that the three delta check methods as applied to individual tests can detect erroneous test results, but unfortunately deltas of similar magnitude occurred as a result of disease or therapy two to 15 times as often as those due to errors (for this group of tests and this patient population). This result limits the efficiency of delta check methods in this setting, because the major effort in delta check failure follow-up would be spent on false positives. Nevertheless, we believe that delta check methods can serve a useful role. First, they detect errors that escape standard quality-control techniques, and of course a primary goal of a clinical laboratory is to eliminate erroneous results. Second, flagging test results that fail the delta check but then pass the laboratory's review procedure should increase the clinician's confidence in these results and decrease unnecessary repeat tests.
References

1. Ladenson, J. H., Patients as their own controls: Use of the computer to identify "laboratory error." Clin. Chem. 21, 1648-1653 (1975).
2. Whitehurst, P., DeSilvio, T. V., and Boyadjian, G., Evaluation of discrepancies in patients' results: an aspect of computer-assisted quality control. Clin. Chem. 21, 87-92 (1975).
3. Wheeler, L. A., and Sheiner, L. B., Delta check tables for the Technicon SMA 6 continuous-flow analyzer. Clin. Chem. 23, 216-219 (1977).
4. Sheiner, L. B., Wheeler, L. A., and Moore, J. K., The performance of delta check methods. Clin. Chem. 25, 2034-2037 (1979).