Diagnostic accuracy of dermoscopy - Dermatology
Dermoscopy
The accuracy of the clinical diagnosis of cutaneous melanoma with the unaided eye is only about 60%. Dermoscopy, a non-invasive, in vivo technique for the microscopic examination of pigmented skin lesions, has the potential to improve the diagnostic accuracy. Our objectives were to review previous publications, to compare the accuracy of melanoma diagnosis with and without dermoscopy, and to assess the influence of study characteristics on the diagnostic accuracy. We searched for publications between 1987 and 2000 and identified 27 studies eligible for meta-analysis. The diagnostic accuracy for melanoma was significantly higher with dermoscopy than without this technique (log odds ratio 4.0 [95% CI 3.0 to 5.1] versus 2.7 [1.9 to 3.4]; an improvement of 49%, p = 0.001). The diagnostic accuracy of dermoscopy significantly depended on the degree of experience of the examiners. Dermoscopy by untrained or less experienced examiners was no better than clinical inspection without dermoscopy. The diagnostic performance of dermoscopy improved when the diagnosis was made by a group of examiners in consensus and diminished as the prevalence of melanoma increased. A comparison of various diagnostic algorithms for dermoscopy showed no significant differences in their diagnostic performance. A thorough appraisal of the study characteristics showed that most of the studies were potentially influenced by verification bias. In conclusion, dermoscopy improves the diagnostic accuracy for melanoma in comparison with inspection by the unaided eye, but only for experienced examiners.
Lancet Oncol 2002; 3: 159–65
Review
Diagnostic accuracy of dermoscopy
H Kittler, H Pehamberger, K Wolff, and M Binder

Early diagnosis is thought to be very important for improving the prognosis of patients with cutaneous melanoma, but even in specialised centres the accuracy of the clinical diagnosis for melanoma achieved with the unaided eye is only slightly better than 60% [1]. Dermoscopy (epiluminescence microscopy, dermatoscopy, skin-surface microscopy, incident light microscopy) is a non-invasive, in vivo examination with a microscope that uses incident light and oil immersion to make subsurface structures of the skin accessible to visual examination (Figure 1). Dermoscopy allows the observer to look not only onto but also into the superficial skin layers, and thus permits a more detailed inspection of pigmented skin lesions [2]. The results of several studies have suggested that dermoscopy improves the rate of detection of melanoma compared with inspection by the unaided eye [3]. However, the reported sensitivity and specificity vary significantly between studies, partly because the diagnostic accuracy of dermoscopy depends on the amount of training of the dermatologist, the diagnostic difficulty of the lesions, and the type of algorithm used for assessment [4–6], but also as a result of differences in the explicit or implicit threshold used to differentiate between melanoma and non-melanoma. We used the meta-analytic method for diagnostic tests, which combines data from many studies [7,8], takes into account differences in the test threshold, and provides a way to examine the association between test accuracy and study characteristics. Our aims were to compare the diagnostic accuracy for melanoma with and without dermoscopy, to assess the influence of study characteristics on the diagnostic accuracy of dermoscopy, and to report summary estimates of the diagnostic accuracy by combining data from many reports.

Figure 1. Superficial spreading melanoma viewed with dermoscopy (large panel) and with the unaided eye (inset panel). Compared with the unaided eye, dermoscopy reveals several additional structural features, which are typical of melanoma, including irregular dots and irregular extensions (pseudopods) in the periphery and a blue-whitish veil.
Methods
Eligible studies (see Search strategy and selection criteria) were classified, with no masking, by two readers in consensus on prospectively defined characteristics important for assessment of diagnostic tests. The following information was extracted from each report: authors’ names; year of publication; description of pigmented skin lesions (melanoma prevalence, melanoma invasion thickness, frequency of non-melanocytic lesions); experience of examiners; independence of clinical and histological assessment; type of diagnostic algorithm; mode of diagnosis; mode of presentation; and results (sensitivity and specificity). The independence of clinical and histological assessment was defined according to whether the clinical diagnosis was made without knowledge of histology. The diagnostic algorithm refers to the type of analysis that was used for the dermoscopic assessment of pigmented lesions. We differentiated between pattern analysis as described by Pehamberger and colleagues [9], the ABCD rule for dermoscopy reported by Stolz and co-workers [10,11], and algorithms that used a modified form of pattern analysis in conjunction with a scoring system. The latter group included the 7-point checklist of Argenziano and colleagues [12], Menzies and co-workers’ scoring system [13,14], risk stratification as described by Kenet and Fitzpatrick [15], and the seven features of melanoma as described by Benelli and others [16–18].

All the authors are at the Department of Dermatology, Division of General Dermatology, University of Vienna Medical School, Vienna, Austria. HK is a Research Assistant, HP is a Professor, KW is Professor and Chairman, and MB is an Associate Professor.
Correspondence: Dr Harald Kittler, Department of Dermatology, University of Vienna Medical School, Waehringerguertel 18–20, A-1090 Vienna, Austria. Tel: +43 1 40400 7701. Fax: +43 1 4081928. E-mail: h.kittler@akh-wien.ac.at

THE LANCET Oncology Vol 3 March 2002 http://oncology.thelancet.com 159
For personal use. Only reproduce with permission from The Lancet Publishing Group.
Table 1. Main characteristics of eligible studies

| First author | Ref | Number of lesions (% melanomas) | NML included | Dermoscopic experience of examiners | Dermoscopic algorithm | Mode of presentation | Assessment independent | Mode of diagnosis |
|---|---|---|---|---|---|---|---|---|
| Argenziano | 12 | 342 (34%) | No | Experts and non-experts | Scoring system, ABCD rule*, pattern analysis | Images | Yes | Consensus |
| Bauer | 19 | 279 (15%) | No | Experts | Pattern analysis | Patients | Yes | Consensus |
| Benelli | 17 | 401 (15%) | Yes | Experts | Scoring system | Patients | Yes | Not recorded |
| Binder | 20 | 100 (40%) | No | Experts | Pattern analysis | Images | Yes | Consensus |
| Binder | 6 | 240 (24%) | Yes | Experts and non-experts | Pattern analysis | Images | Yes | Individual |
| Binder | 5 | 100 (37%) | Yes | Non-experts before and after training | Pattern analysis | Images | Yes | Individual |
| Binder | 4 | 250 (16%) | No | Experts and non-experts | ABCD rule*, pattern analysis | Images | Yes | Individual |
| Carli | 21 | 15 (27%) | No | Experts | Pattern analysis | Images | Yes | Individual |
| Cristofolini | 22 | 220 (15%) | Yes | Experts | Pattern analysis | Patients | Yes | Consensus |
| Dal Pozzo | 18 | 713 (22%) | Yes | Experts | Scoring system | Images | Yes | Consensus |
| Dummer | 23 | 824 (3%) | Yes | Experts | Pattern analysis | Patients | Yes | Not recorded |
| Feldmann | 24 | 500 (6%) | No | Experts | ABCD rule* | Patients | Yes | Individual |
| Kittler | 25 | 50 (46%) | No | Experts | Pattern analysis | Images | Yes | Individual |
| Kittler | 26 | 356 (21%) | Yes | Experts | ABCD rule* | Images | Yes | Individual |
| Krähn | 27 | 80 (49%) | No | Experts | Not recorded | Patients | Yes | Not recorded |
| Lorentzen | 28 | 232 (21%) | Yes | Experts and non-experts | Pattern analysis | Images | Not recorded | Individual |
| Lorentzen | 29 | 258 (25%) | Yes | Experts and non-experts | Scoring system, ABCD rule | Images | Not recorded | Individual |
| Menzies | 13 | 385 (28%) | Yes | Experts | Scoring system | Images | Yes | Not recorded |
| Nachbar | 11 | 172 (40%) | No | Experts | ABCD rule* | Patients | Yes | Not recorded |
| Nilles | 30 | 209 (20%) | No | Experts | Scoring system | Not recorded | Not recorded | Not recorded |
| Seidenari | 31 | 90 (34%) | No | Experts and non-experts | Pattern analysis | Images | Not recorded | Individual |
| Soyer | 32 | 159 (41%) | Yes | Experts | Pattern analysis | Patients | Yes | Individual |
| Stanganelli | 33 | 20 (50%) | No | Experts | Pattern analysis | Images | Yes | Individual |
| Stanganelli | 34 | 3329 (2%) | No | Experts | Pattern analysis | Patients | Yes | Individual |
| Steiner | 35 | 318 (23%) | Yes | Experts | Pattern analysis | Patients | Yes | Consensus |
| Stolz | 10 | 79 (61%) | No | Experts | ABCD rule* | Images | Not recorded | Consensus |
| Westerhoff | 36 | 100 (50%) | Yes | Non-experts before and after training | Scoring system | Images | Yes | Individual |

NML, non-melanocytic skin lesions. *The ABCD rule aids clinical diagnosis of melanoma on the basis of observable morphological features: asymmetry, border irregularity, colour variegation, and dermoscopic structure.
For mode of diagnosis, we noted whether or not the diagnosis was established in consensus by a group of examiners, and for mode of presentation, we differentiated between studies that used presentation of colour prints, photographs, slides, or digital images and studies that investigated the accuracy of face-to-face diagnosis. Studies were further examined according to whether their results were potentially influenced by verification bias. Verification bias is likely when the decision to proceed with the reference test (histopathology) partly depends on the results of the clinical diagnosis. The influence of verification bias on the diagnostic accuracy was not analysed statistically because only one study looked at the outcome of benign lesions that were not selected for excision.
Statistical analysis
Sensitivity and specificity were calculated according to standard formulae. When individual assessments from several observers were given in a study, the median values of the sensitivity and specificity were used in our analysis.

Least-squares linear regression was used to estimate parameters for summary receiver-operating-characteristic (SROC) models. Estimates of sensitivity and specificity were obtained from each study and used to calculate their log odds ratio (logit), which measures how well the test discriminates between melanoma and non-melanoma. The SROC model was obtained by regression of the difference, D, of the logits, logit(sensitivity) minus logit(1 minus specificity), on the sum, S, of the logits, logit(sensitivity) plus logit(1 minus specificity), to test whether the log odds ratio is associated with the test threshold [7,8]. An inverse transformation was then used to transform the data back to the ROC space and to express sensitivity as a function of 1 minus specificity. SROC curves were constructed for each diagnostic method, and differences between them were compared by use of linear regression analysis. To adjust for covariates we used multiple linear regression analysis.
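The regression of D on S and the back-transformation to ROC space can be sketched in a few lines of Python. This is a minimal illustration of the general approach, not the SPSS procedure used for the published analysis. The function names (`logit`, `d_and_s`, `fit_line`, `sroc_sensitivity`) are hypothetical, and the four input pairs are dermoscopy sensitivities and specificities taken from Table 2 purely as sample data.

```python
import math


def logit(p):
    """Log odds of a probability p, with 0 < p < 1."""
    return math.log(p / (1.0 - p))


def d_and_s(sensitivity, specificity):
    """Return (D, S) for one study.

    D = logit(sensitivity) - logit(1 - specificity) is the log odds ratio.
    S = logit(sensitivity) + logit(1 - specificity) reflects the test threshold.
    """
    tpr = logit(sensitivity)
    fpr = logit(1.0 - specificity)
    return tpr - fpr, tpr + fpr


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a + b * x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    return mean_y - b * mean_x, b


def sroc_sensitivity(false_positive_rate, a, b):
    """Back-transform the fitted line D = a + b * S to ROC space.

    Solving D = a + b * S for logit(TPR) gives
    logit(TPR) = a / (1 - b) + (1 + b) / (1 - b) * logit(FPR).
    """
    z = a / (1.0 - b) + (1.0 + b) / (1.0 - b) * logit(false_positive_rate)
    return 1.0 / (1.0 + math.exp(-z))


# (sensitivity, specificity) with dermoscopy for four studies in Table 2,
# used here only as sample input.
studies = [(0.80, 0.89), (0.68, 0.91), (0.75, 0.89), (0.88, 0.79)]
pairs = [d_and_s(sens, spec) for sens, spec in studies]
a, b = fit_line([s for _, s in pairs], [d for d, _ in pairs])

# Sensitivity on the fitted SROC curve at 1 - specificity = 0.1, 0.2, ..., 0.9.
curve = [(fpr / 10, sroc_sensitivity(fpr / 10, a, b)) for fpr in range(1, 10)]
```

A slope b close to zero means the log odds ratio does not vary with the threshold, in which case the curve is symmetric and the intercept a alone summarises discriminatory power.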
For the comparison of more than two groups, the log odds ratios were compared by ANOVA, and adjustment for covariates was done by ANCOVA. The Scheffé test was used to account for multiple comparisons.

For paired observations, the log odds ratios were compared by use of the paired t test. If studies that were included in the paired analysis reported the results for experts and non-experts, only the experts’ readings were included in the model. The mean difference between the log odds ratios observed in the paired analysis was used to calculate the relative improvement achieved with dermoscopy.
Univariate and multivariate regression analyses were done to assess the variation in diagnostic accuracy due to study characteristics. The regression coefficients give a measure of the difference in diagnostic performance, with positive coefficients indicating better discriminatory power and negative coefficients corresponding to lower discriminatory ability. For multivariate analysis we used a forward stepwise linear regression analysis. Variables were entered in the stepwise model if the probability obtained from the F test was below 0.05 and removed if p was greater than 0.1.

Statistical analyses used SPSS (version 10.0). All p values are two-tailed.
Results
Study characteristics
The main characteristics of each of the 27 eligible studies [4–6,10–13,17–36] are presented in Table 1. The pooled sample was 9821 pigmented skin lesions (median per study 232). The prevalence of melanoma ranged from 1.6% to 60.8% (mean 28.3%). The mean or median Breslow thickness was reported in 15 studies and ranged from 0.40 mm to 1.11 mm (median 0.70 mm).

In most of the available studies, all lesions were selected for disease verification. Only one study looked at the outcome of benign lesions that were not selected for excision [34].

Several studies compared different diagnostic methods for the diagnosis of melanoma. In fourteen studies (52%), the diagnostic accuracy for melanoma with and without dermoscopy was directly compared and in three (11%) two or more diagnostic algorithms for dermoscopy were compared. Pattern analysis was used in 16 studies (59%), the ABCD rule in seven (26%), and modified pattern analysis in conjunction with a scoring system in seven (26%). Five studies (19%) compared the performance of experts and non-experts, and two (7%) assessed the influence of training on the performance of non-experts. All but one study investigated dermatologists; Westerhoff and colleagues studied the effect of dermoscopy on the diagnostic performance of primary-care physicians [36].
The first model was a paired analysis and included only those studies that directly compared the diagnostic accuracy for melanoma with and without dermoscopy (Table 2). One of these 14 studies presented the results in such a way that the sensitivity and specificity of dermoscopy could not be calculated, and it was therefore excluded from the paired analysis. The mean log odds ratio achieved with dermoscopy was significantly higher than that achieved without dermoscopy (4.0 [95% CI 3.0 to 5.1] versus 2.7 [1.9 to 3.4]), resulting in a mean difference of 1.3 (0.7 to 2.0), or an improvement of 49% (p = 0.001).

Table 2. Main results of studies that directly compared the diagnostic accuracy for melanoma with and without dermoscopy

| First author | Ref | Sample size | Sensitivity, unaided eye | Sensitivity, dermoscopy | Specificity, unaided eye | Specificity, dermoscopy | Log odds ratio, unaided eye | Log odds ratio, dermoscopy |
|---|---|---|---|---|---|---|---|---|
| Benelli | 17 | 401 | 0.67 | 0.80 | 0.79 | 0.89 | 2.04 | 3.49 |
| Binder | 6 | 240 | 0.58 | 0.68 | 0.91 | 0.91 | 2.64 | 3.07 |
| Binder | 5 | 100 | 0.73 | 0.73 | 0.70 | 0.78 | 1.84 | 2.26 |
| Carli | 21 | 15 | 0.42 | 0.75 | 0.78 | 0.89 | 0.93 | 3.17 |
| Cristofolini | 22 | 220 | 0.85 | 0.88 | 0.75 | 0.79 | 2.83 | 3.32 |
| Dummer | 23 | 824 | 0.65 | 0.96 | 0.93 | 0.98 | 3.21 | 7.07 |
| Krähn | 27 | 80 | 0.79 | 0.90 | 0.78 | 0.93 | 2.59 | 4.78 |
| Lorentzen | 28 | 232 | 0.77 | 0.82 | 0.89 | 0.94 | 3.30 | 4.27 |
| Nachbar | 11 | 172 | 0.84 | 0.93 | 0.84 | 0.91 | 3.29 | 4.89 |
| Soyer | 32 | 159 | 0.94 | 0.94 | 0.82 | 0.82 | 4.27 | 4.27 |
| Stanganelli | 33 | 20 | 0.55 | 0.73 | 0.79 | 0.73 | 1.52 | 1.94 |
| Stanganelli | 34 | 3329 | 0.67 | 0.93 | 0.99 | 1.00 | 5.82 | 8.25 |
| Westerhoff | 36 | 100 | 0.63 | 0.76 | 0.54 | 0.58 | 0.66 | 1.46 |
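The paired comparison can be retraced from the log odds ratios listed in Table 2. The sketch below is a plain-Python recomputation under stated assumptions, not the authors' SPSS output. It uses the rounded table values, a tabulated critical value of Student's t for 12 degrees of freedom, and it assumes that the relative improvement is the mean paired difference divided by the mean log odds ratio without dermoscopy. The helper names are hypothetical.

```python
import math

# Log odds ratios from Table 2, in row order (13 paired studies).
unaided = [2.04, 2.64, 1.84, 0.93, 2.83, 3.21, 2.59,
           3.30, 3.29, 4.27, 1.52, 5.82, 0.66]
dermoscopy = [3.49, 3.07, 2.26, 3.17, 3.32, 7.07, 4.78,
              4.27, 4.89, 4.27, 1.94, 8.25, 1.46]


def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


def paired_t(before, after):
    """Return (mean difference, standard error, t statistic) for paired data."""
    diffs = [y - x for x, y in zip(before, after)]
    n = len(diffs)
    d_bar = mean(diffs)
    variance = sum((d - d_bar) ** 2 for d in diffs) / (n - 1)
    se = math.sqrt(variance / n)
    return d_bar, se, d_bar / se


mean_diff, se, t_stat = paired_t(unaided, dermoscopy)

# Two-sided 95% critical value of Student's t with 12 degrees of freedom.
T_CRIT_DF12 = 2.179
ci_low = mean_diff - T_CRIT_DF12 * se
ci_high = mean_diff + T_CRIT_DF12 * se

# ASSUMPTION: relative improvement = mean paired difference / unaided-eye mean.
relative_improvement = mean_diff / mean(unaided)
```

With the rounded inputs, the means, the mean difference, and its confidence interval come out close to the reported 2.7, 4.0, and 1.3 (0.7 to 2.0), and the relative improvement lands near the reported 49%. Small discrepancies are expected because the table values are rounded to two decimals.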
The second model included the results of all 27 eligible studies and yielded similar results. The mean log odds ratio achieved with dermoscopy was again significantly higher than that achieved without dermoscopy (3.4 [2.9 to 3.9] versus 2.5 [1.9 to 3.1], p = 0.03). Inclusion of information on the experience of the examiners showed that the diagnostic performance of dermoscopy was significantly better for experts than for non-experts (mean log odds ratio 3.8 [3.3 to 4.3] versus 2.0 [1.4 to 2.6]; mean difference 1.8 [0.8 to 2.7], p = 0.001). To account for this finding, we generated a model that compared the performance of the clinical diagnosis without dermoscopy, dermoscopy by non-experts, and dermoscopy by experts. For each of the methods, SROC curves were constructed (Figure 2). The clinical diagnosis without dermoscopy showed similar diagnostic accuracy to dermoscopy by non-experts (mean log odds ratio 2.5 versus 2.0; mean difference 0.5 [95% CI for difference -0.4 to 1.4], p = 0.65). For both approaches the diagnostic accuracy was significantly lower than that achieved with dermoscopy by experts (mean log odds ratio 3.8, p = 0.003 and p = 0.001).
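To give these log odds ratios a more tangible meaning, the sketch below converts each mean value into the sensitivity it would imply at a fixed specificity. It assumes a symmetric SROC curve, that is, a log odds ratio that does not change with the test threshold; the fitted curves in Figure 2 need not satisfy this, so the numbers are illustrative only. The function name `sensitivity_at` and the choice of 90% specificity are hypothetical.

```python
import math


def sensitivity_at(specificity, log_odds_ratio):
    """Sensitivity implied by a constant log odds ratio at a given specificity.

    ASSUMPTION: symmetric SROC curve, so that
    logit(sensitivity) = log_odds_ratio + logit(1 - specificity).
    """
    fpr = 1.0 - specificity
    z = log_odds_ratio + math.log(fpr / (1.0 - fpr))
    return 1.0 / (1.0 + math.exp(-z))


# Mean log odds ratios reported in the text for the three approaches.
mean_log_or = {
    "without dermoscopy": 2.5,
    "dermoscopy, non-experts": 2.0,
    "dermoscopy, experts": 3.8,
}

# Implied sensitivity at an arbitrarily chosen specificity of 90%.
implied = {name: sensitivity_at(0.90, value)
           for name, value in mean_log_or.items()}
```

Under this assumption, a specificity of 90% corresponds to a sensitivity of roughly 0.83 for dermoscopy by experts, 0.58 without dermoscopy, and 0.45 for dermoscopy by non-experts.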
The influence of study characteristics on the diagnostic performance of dermoscopy was investigated by univariate and multivariate regression analysis including the results of all eligible studies. As in the analysis above, the diagnostic performance of dermoscopy increased for experts (regression coefficient 1.8 [95% CI 0.8 to 2.8], p < 0.001). The diagnostic performance also increased when the diagnosis was made by a group of two or more examiners in consensus (regression coefficient 1.1 [0.2 to 2.1], p = 0.02). Although consensus increased the discriminatory power of dermoscopy, the procedure performed by experts achieved higher accuracy than inspection with the unaided eye whether or not the dermoscopic diagnosis was made in consensus. The accuracy of the (clinically more relevant) non-consensus diagnosis achieved with dermoscopy was significantly higher than that achieved without dermoscopy (mean log odds ratio 3.7 versus 2.5; mean difference 1.2 [95% CI for difference 0.3 to 2.2], p = 0.01).

Figure 2. SROC curves for the performance of the clinical diagnosis without dermoscopy (red line), dermoscopy by experts (black line), and dermoscopy by non-experts (blue line). Axes: sensitivity (vertical) against 1 − specificity (horizontal), both from 0 to 1.
The diagnostic ability of dermoscopy was inversely correlated with the prevalence of melanoma in the sample (regression coefficient -0.04 [95% CI -0.06 to -0.01], p = 0.006) and lower for experimental studies that used presentation of slides, colour prints, or digital images than for clinical studies in which the diagnosis was made face to face (regression coefficient -1.3 [-2.1 to -0.5], p = 0.001). Other study characteristics did not significantly influence the diagnostic performance of dermoscopy.

For multivariate analysis we used a forward stepwise regression analysis. The final model included three variables: the experience of examiners (regression coefficient 1.2 [0.3 to 2.1], p = 0.01), the prevalence of melanoma (regression coefficient -0.04 [-0.06 to -0.01], p = 0.01), and whether the diagnosis was made in consensus (regression coefficient 1.0 [0.04 to 1.9], p = 0.04). Other variables were not independently associated with the diagnostic accuracy of dermoscopy. Since the dermatologists’ experience was the strongest predictive variable for the diagnostic performance of dermoscopy, we built a SROC model for the pooled diagnostic performance of dermoscopy adjusted for three settings with different degrees of experience (Figure 3).
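The three coefficients of the final model can be read as additive shifts in the log odds ratio. The sketch below illustrates that reading only. The text does not report the model intercept or how the variables were coded, so the code computes differences between settings rather than absolute values. It assumes binary coding for experience and consensus and prevalence expressed in percentage points; all three are assumptions, and the function name `log_or_shift` is hypothetical.

```python
# Coefficients of the final stepwise model as reported in the text.
COEF_EXPERT = 1.2        # ASSUMED coding: 1 = expert examiners, 0 = non-experts
COEF_PREVALENCE = -0.04  # ASSUMED unit: per percentage point of melanoma prevalence
COEF_CONSENSUS = 1.0     # ASSUMED coding: 1 = consensus diagnosis, 0 = individual


def log_or_shift(expert, prevalence_pct, consensus):
    """Shift in log odds ratio relative to a hypothetical reference setting
    (non-experts, individual diagnosis, 0% prevalence).

    The intercept is not reported, so only differences between two calls
    are meaningful.
    """
    return (COEF_EXPERT * expert
            + COEF_PREVALENCE * prevalence_pct
            + COEF_CONSENSUS * consensus)


# Experts versus non-experts at the same prevalence and mode of diagnosis.
expert_vs_non_expert = log_or_shift(1, 20, 0) - log_or_shift(0, 20, 0)

# A sample with 40% melanomas versus one with 10%, all else equal.
high_vs_low_prevalence = log_or_shift(1, 40, 0) - log_or_shift(1, 10, 0)
```

Read this way, a 30-point rise in prevalence would offset the gain attributed to examiner experience, which gives a sense of the relative size of the two effects under the stated assumptions.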
Univariate analysis of the individual results of all eligible studies showed that the diagnostic accuracy of dermoscopy was similar for the different diagnostic algorithms. The log odds ratios achieved with pattern analysis (3.6 [95% CI 2.8 to 4.4]), the ABCD rule (3.2 [2.4 to 3.9]), and scoring systems (3.1 [2.1 to 4.0]) did not differ significantly (p = 0.64). We analysed the influence of the experience of the examiners on the performance of the diagnostic algorithms. The degree of experience had a significant effect on the diagnostic accuracy of pattern analysis (regression coefficient 2.0 [95% CI 0.4 to 3.6], p = 0.02) and scoring systems (regression coefficient 2.3 [0.5 to 4.1], p = 0.02). By contrast, the degree of experience had no significant effect on the diagnostic accuracy achieved with the ABCD rule (regression coefficient 0.8 [-1.1 to 2.7], p = 0.35).
Discussion
This meta-analysis of 27 studies provides evidence that dermoscopy gives better diagnostic accuracy for melanoma than clinical inspection without dermatoscopy (ie with the unaided eye). This conclusion accords with that of a previous review, which included six studies [37], and another meta-analysis, which included eight studies [38]. The review did not provide a quantitative analysis and the other meta-analysis was restricted to studies that directly compared the diagnostic performance with and without dermoscopy. Neither study addressed the influence of study characteristics on the diagnostic performance of dermoscopy.

Figure 3. SROC curves for the pooled diagnostic performance of dermoscopy. The base case (black line) is adjusted to a setting at which half of the examiners are experienced in dermoscopy (experts). The best case (red line) is adjusted to a setting at which all examiners are experts in dermoscopy. The worst case (blue line) is adjusted to a setting at which all examiners are untrained or less experienced (non-experts). Axes: sensitivity (vertical) against 1 − specificity (horizontal), both from 0 to 1.
According to our analysis, the diagnostic accuracy of dermoscopy significantly depends on the experience of the examiners. Moreover, the diagnostic accuracy achieved is no better with dermoscopy applied by non-experts than with the unaided eye. This finding underlines the importance of training for the application of dermoscopy [5,6]. The study by Westerhoff and colleagues, investigating the value of dermoscopy on the diagnostic performance of primary-care physicians, deserves further attention [36]. It was the only study of non-dermatologists. Primary-care physicians were trained to use a simplified diagnostic scoring system for dermoscopy. Their diagnostic performance before training was only slightly better than chance. After training, there was a significant improvement in the diagnosis of melanoma by dermoscopy versus inspection with the unaided eye. However, the reported diagnostic accuracy after training was much lower than in comparable studies involving dermatologists.
We also found that the diagnostic performance of dermoscopy was improved when the diagnosis was made by a group of examiners in consensus. A consensus diagnosis might not be practicable in most clinical settings, but it may be important for telemedical applications. By electronic transmission of digital dermoscopic images, teledermoscopy potentially involves two or more experts at geographically distant facilities. However, how a consensus can be reached for a group of examiners working at geographically distant facilities is unclear. Two studies that compared face-to-face diagnosis with remote diagnosis found no differences in the diagnostic performances, indicating that electronically transmitted dermoscopic images convey the information necessary for differentiation between melanoma and non-melanoma [39,40]. Future work should assess the value of a consensus diagnosis for electronically transmitted dermoscopic images.
The prevalence <strong>of</strong> melanoma was inversely correlated<br />
with the diagnostic <strong>accuracy</strong> <strong>of</strong> <strong>dermoscopy</strong>. A possible<br />
interpretation <strong>of</strong> this finding is that if more melanomas are<br />
included in a sample, the overall diagnostic difficulty <strong>of</strong> the<br />
sample is increased. Another explanation could be<br />
differences in the criteria applied to select the lesions<br />
between the studies.<br />
Since the original reports by Pehamberger, Steiner, and<br />
colleagues, 3,9,35 describing the use of pattern analysis for the<br />
dermoscopic assessment of pigmented skin lesions, several<br />
diagnostic algorithms have been developed. Pattern analysis<br />
relies on the description of several dermoscopic features,<br />
which can be difficult for non-experts to recognise. Scoring<br />
systems are simplified versions of pattern analysis with a<br />
limited number of dermoscopic features. The ABCD rule<br />
differs from the other algorithms in that the exact<br />
description of the dermoscopic features is less<br />
important. Pattern analysis requires sufficient<br />
training, 6 whereas the other, simpler, diagnostic algorithms<br />
might be more suitable for less experienced examiners. 4,12 In<br />
our analysis, no algorithm performed significantly better than<br />
the others: pattern analysis showed slightly better diagnostic<br />
accuracy, but the differences were not statistically<br />
significant. As expected, the diagnostic performance of<br />
pattern analysis was strongly influenced by the experience<br />
of the examiners. Surprisingly, this was also true for scoring<br />
systems. One explanation might be that, as for pattern<br />
analysis, the recognition of dermoscopic features is crucial<br />
for the diagnostic procedure. Compared with pattern<br />
analysis and scoring systems, the degree of experience had<br />
less influence on the diagnostic ability of the ABCD rule for<br />
dermoscopy, which suggests that this algorithm is especially<br />
suitable for beginners in dermoscopy.<br />
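The ABCD rule reduces the assessment to four graded sub-scores combined into one weighted sum, which may be part of why it depends less on examiner experience. The sketch below is illustrative only: the weights, value ranges, and cut-offs are those published for the ABCD rule of dermatoscopy by Stolz and colleagues (reference 10) and are not stated in this review, and the function and variable names are hypothetical.<br />

```python
# Weights, ranges and cut-offs as published for the ABCD rule of dermatoscopy
# (Stolz et al., reference 10); they are NOT taken from this review.
WEIGHTS = {"asymmetry": 1.3, "border": 0.1, "colours": 0.5, "structures": 0.5}
RANGES = {"asymmetry": (0, 2), "border": (0, 8), "colours": (1, 6), "structures": (1, 5)}


def total_dermoscopy_score(asymmetry, border, colours, structures):
    """Return the weighted sum of the four ABCD sub-scores."""
    scores = {"asymmetry": asymmetry, "border": border,
              "colours": colours, "structures": structures}
    for name, value in scores.items():
        low, high = RANGES[name]
        if not low <= value <= high:
            raise ValueError(f"{name} must lie between {low} and {high}")
    return sum(WEIGHTS[name] * value for name, value in scores.items())


def classify(tds):
    """Map a total score to the three published categories."""
    if tds < 4.75:
        return "benign"
    if tds <= 5.45:
        return "suspicious"
    return "highly suspicious for melanoma"


# Hypothetical lesion: asymmetric in two axes, abrupt border cut-off in
# five segments, four colours, three differential structures.
example = total_dermoscopy_score(asymmetry=2, border=5, colours=4, structures=3)
print(round(example, 2), classify(example))
```

The examiner only has to count features per category rather than describe each one exactly, which is the property the paragraph above attributes to this algorithm.<br />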
As shown by the SROC curves in Figure 3, the<br />
diagnostic accuracy of dermoscopy does not reach 100%<br />
even under the assumption of optimum conditions,<br />
indicating that dermoscopy cannot replace histopathology.<br />
However, dermoscopy may provide useful additional<br />
information for the histopathologist in difficult cases. Soyer<br />
and colleagues showed that clinicopathological correlation<br />
of pigmented skin lesions by dermoscopy is useful for<br />
dermatopathologists when reporting on melanocytic skin<br />
lesions. 41 Dermoscopy and histopathology should be<br />
regarded as concurrent examinations of a joint diagnostic<br />
procedure with additive information.<br />
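To give a feel for what the summary log odds ratios mean on an SROC curve, the following sketch converts a log odds ratio into the point where sensitivity equals specificity. The values 4.0 and 2.7 are the pooled estimates quoted in the abstract of this review. The assumption of a symmetric curve and the function names are ours, so the resulting figures are an illustration and not results reported by the authors.<br />

```python
import math


def logit(p):
    """Log odds of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def log_odds_ratio(sensitivity, specificity):
    """Log diagnostic odds ratio: logit(sensitivity) + logit(specificity)."""
    return logit(sensitivity) + logit(specificity)


def symmetric_operating_point(log_or):
    """Sensitivity (= specificity) implied by a log odds ratio,
    assuming a symmetric SROC curve (an assumption of this sketch)."""
    return 1.0 / (1.0 + math.exp(-log_or / 2.0))


with_dermoscopy = symmetric_operating_point(4.0)      # pooled estimate with dermoscopy
without_dermoscopy = symmetric_operating_point(2.7)   # pooled estimate without dermoscopy
print(f"with dermoscopy:    {with_dermoscopy:.2f}")
print(f"without dermoscopy: {without_dermoscopy:.2f}")
```

Under this assumption a log odds ratio of 4.0 corresponds to a sensitivity and specificity of roughly 0.88 each, against roughly 0.79 for 2.7. Any finite log odds ratio keeps both below 1, which is consistent with the statement that the accuracy of dermoscopy does not reach 100%.<br />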
THE LANCET Oncology Vol 3 March 2002 http://oncology.thelancet.com 163<br />
For personal use. Only reproduce with permission from The Lancet Publishing Group.
Search strategy and selection criteria<br />
Relevant studies were identified and retrieved by a search<br />
of MEDLINE for the period January 1987 to December<br />
2000, by manual searches of the reference lists of retrieved<br />
articles, and by direct communication with experts on this<br />
topic. The terms “epiluminescence”, “dermoscopy”,<br />
“dermatoscopy”, and “incident light microscopy” were<br />
linked with a Boolean OR operator and the search yielded<br />
157 articles. 116 articles were excluded at this stage: those<br />
that were not relevant to the topic, did not address the<br />
diagnostic accuracy for melanoma, or were published in<br />
languages other than English or German, review articles,<br />
letters, and reports without original data. Additional<br />
articles were identified by manual searches of the<br />
reference lists of retrieved articles and by direct<br />
communication with experts. Articles that did not include<br />
original data on the diagnostic accuracy for melanoma<br />
and those that did not report sufficient data for the<br />
sensitivity and specificity to be estimated were excluded.<br />
Estimates of the diagnostic accuracy for melanoma<br />
involving computerised image analysis were also excluded<br />
from further analysis. The final sample included 27<br />
studies, of which 20 were identified by the MEDLINE<br />
search, three by manual searches of the reference lists of<br />
retrieved articles, and four by communication with<br />
experts.<br />
The summary estimates of the diagnostic accuracy of<br />
dermoscopy provided by this meta-analysis have to be<br />
interpreted with caution because the results of most studies<br />
were potentially influenced by verification bias, which is<br />
likely to occur when the decision to proceed with the<br />
reference test (histopathology) partly depends on the<br />
results of the clinical diagnosis. Suspect clinical findings<br />
are more likely to be investigated by histopathology, so the<br />
chance of detecting a true positive is higher than that for a<br />
false negative and the chance for detecting a false positive is<br />
higher than that for a true negative. In this case, sensitivity<br />
seems to be falsely increased and specificity falsely<br />
decreased. Since most of the studies included in our<br />
meta-analysis were potentially influenced by verification bias, in<br />
general the sensitivity is probably overestimated and the<br />
specificity underestimated.<br />
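The direction of verification bias can be checked with simple arithmetic. The sketch below uses an entirely hypothetical cohort and hypothetical verification rates (none of these numbers come from the studies in this review) and computes sensitivity and specificity among histologically verified lesions only.<br />

```python
def observed_accuracy(tp, fn, fp, tn, verify_pos, verify_neg):
    """Sensitivity and specificity among histologically verified lesions only.

    verify_pos / verify_neg are the fractions of clinically positive and
    clinically negative lesions that are sent for histopathology.
    """
    tp_v, fp_v = tp * verify_pos, fp * verify_pos   # clinically positive lesions
    fn_v, tn_v = fn * verify_neg, tn * verify_neg   # clinically negative lesions
    sensitivity = tp_v / (tp_v + fn_v)
    specificity = tn_v / (tn_v + fp_v)
    return sensitivity, specificity


# Hypothetical cohort: 100 melanomas and 900 benign lesions, examined with
# a true sensitivity of 0.80 and a true specificity of 0.90.
counts = dict(tp=80, fn=20, fp=90, tn=810)

# Every lesion verified versus only 20% of clinically negative lesions verified.
full = observed_accuracy(**counts, verify_pos=1.0, verify_neg=1.0)
partial = observed_accuracy(**counts, verify_pos=1.0, verify_neg=0.2)

print(f"complete verification: sensitivity {full[0]:.2f}, specificity {full[1]:.2f}")
print(f"partial verification:  sensitivity {partial[0]:.2f}, specificity {partial[1]:.2f}")
```

With these made-up numbers the apparent sensitivity rises from 0.80 to about 0.95 and the apparent specificity falls from 0.90 to about 0.64, the same direction of distortion described in the paragraph above.<br />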
Another important issue that may have influenced our<br />
results is publication bias, the systematic error that<br />
arises because an analysis can draw only on studies that<br />
have been published. The influence of publication<br />
bias is difficult to assess. The most important question is<br />
whether our results can be explained solely by its presence.<br />
We think that this is unlikely: publication bias generally<br />
arises because studies with statistically<br />
significant results are more likely to be published<br />
than those with non-significant results, but only a few<br />
studies included in our analysis provided a direct statistical<br />
comparison of the diagnostic accuracy with and without<br />
dermoscopy. However, publication bias cannot be ruled<br />
out completely and may have biased the results of our<br />
analysis towards an overoptimistic estimate of the<br />
diagnostic accuracy of dermoscopy.<br />
Conclusion<br />
Dermoscopy improves the diagnostic accuracy for<br />
melanoma in comparison with inspection by the unaided<br />
eye. However, dermoscopy requires sufficient training and<br />
cannot be recommended for untrained users. A consensus<br />
diagnosis involving two or more experts is recommended to<br />
yield the highest possible diagnostic accuracy.<br />
References<br />
1 Grin CM, Kopf AW, Welkovich B, Bart RS, Levenstein MJ.<br />
Accuracy in the clinical diagnosis of malignant melanoma. Arch<br />
Dermatol 1990; 126: 763–66.<br />
2 Argenziano G, Soyer HP. Dermoscopy of pigmented skin lesions:<br />
a valuable tool for early diagnosis of melanoma. Lancet Oncol 2001;<br />
2: 443–49.<br />
3 Pehamberger H, Binder M, Steiner A, Wolff K. In vivo<br />
epiluminescence microscopy: improvement of early diagnosis of<br />
melanoma. J Invest Dermatol 1993; 100: 356S–62S.<br />
4 Binder M, Kittler H, Steiner A, et al. Reevaluation of the ABCD<br />
rule for epiluminescence microscopy. J Am Acad Dermatol 1999;<br />
40: 171–76.<br />
5 Binder M, Puespoeck-Schwarz M, Steiner A, et al. Epiluminescence<br />
microscopy of small pigmented skin lesions: short-term formal<br />
training improves the diagnostic performance of dermatologists.<br />
J Am Acad Dermatol 1997; 36: 197–202.<br />
6 Binder M, Schwarz M, Winkler A, et al. Epiluminescence<br />
microscopy: a useful tool for the diagnosis of pigmented skin<br />
lesions for formally trained dermatologists. Arch Dermatol 1995;<br />
131: 286–91.<br />
7 Littenberg B, Moses LE. Estimating diagnostic accuracy from<br />
multiple conflicting reports: a new meta-analytic method. Med<br />
Decis Making 1993; 13: 313–21.<br />
8 Moses LE, Shapiro D, Littenberg B. Combining independent<br />
studies of a diagnostic test into a summary ROC curve: data-analytic<br />
approaches and some additional considerations. Stat Med<br />
1993; 12: 1293–316.<br />
9 Pehamberger H, Steiner A, Wolff K. In vivo epiluminescence<br />
microscopy of pigmented skin lesions: I, pattern analysis<br />
of pigmented skin lesions. J Am Acad Dermatol 1987; 17:<br />
571–83.<br />
10 Stolz W, Riemann A, Armand B, et al. ABCD rule of<br />
dermatoscopy: a new practical method for early recognition of<br />
melanoma. Eur J Dermatol 1994; 4: 521–27.<br />
11 Nachbar F, Stolz W, Merkle T, et al. The ABCD rule of<br />
dermatoscopy: high prospective value in the diagnosis of doubtful<br />
melanocytic skin lesions. J Am Acad Dermatol 1994; 30: 551–59.<br />
12 Argenziano G, Fabbrocini G, Carli P, et al. Epiluminescence<br />
microscopy for the diagnosis of doubtful melanocytic skin lesions:<br />
comparison of the ABCD rule of dermatoscopy and a new 7-point<br />
checklist based on pattern analysis. Arch Dermatol 1998; 134:<br />
1563–70.<br />
13 Menzies SW, Ingvar C, Crotty KA, McCarthy WH. Frequency and<br />
morphologic characteristics of invasive melanomas lacking specific<br />
surface microscopic features. Arch Dermatol 1996; 132: 1178–82.<br />
14 Menzies SW, Crotty KA, McCarthy WH. The morphologic criteria<br />
of the pseudopod in surface microscopy. Arch Dermatol 1995; 131:<br />
436–40.<br />
15 Kenet RO, Fitzpatrick TB. Reducing mortality and morbidity of<br />
cutaneous melanoma: a six year plan. B) Identifying high and low<br />
risk pigmented lesions using epiluminescence microscopy.<br />
J Dermatol 1994; 21: 881–84.<br />
16 Benelli C, Roscetti E, Dal Pozzo V. Reproducibility of a dermoscopic<br />
method (7FFM) for the diagnosis of malignant melanoma. Eur J<br />
Dermatol 2000; 10: 110–14.<br />
17 Benelli C, Roscetti E, Dal Pozzo V, et al. The dermoscopic versus the<br />
clinical diagnosis of melanoma. Eur J Dermatol 1999; 9: 470–76.<br />
18 Dal Pozzo V, Benelli C, Roscetti E. The seven features for<br />
melanoma: a new dermoscopic algorithm for the diagnosis of<br />
malignant melanoma. Eur J Dermatol 1999; 9: 303–08.<br />
19 Bauer P, Cristofolini P, Boi S, et al. Digital epiluminescence<br />
microscopy: usefulness in the differential diagnosis of cutaneous<br />
pigmentary lesions: a statistical comparison between visual and<br />
computer inspection. Melanoma Res 2000; 10: 345–49.<br />
20 Binder M, Steiner A, Schwarz M, et al. Application of an artificial<br />
neural network in epiluminescence microscopy pattern analysis of<br />
pigmented skin lesions: a pilot study. Br J Dermatol 1994; 130:<br />
460–65.<br />
21 Carli P, De Giorgi V, Naldi L, Dosi G. Reliability and interobserver<br />
agreement of dermoscopic diagnosis of melanoma and<br />
melanocytic naevi. Eur J Cancer Prev 1998; 7: 397–402.<br />
22 Cristofolini M, Zumiani G, Bauer P, et al. Dermatoscopy:<br />
usefulness in the differential diagnosis of cutaneous pigmentary<br />
lesions. Melanoma Res 1994; 4: 391–94.<br />
23 Dummer W, Doehnel KA, Remy W. Videomicroscopy in<br />
differential diagnosis of skin tumors and secondary prevention of<br />
malignant melanoma. Hautarzt 1993; 44: 772–76.<br />
24 Feldmann R, Fellenz C, Gschnait F. The ABCD rule in<br />
dermatoscopy: analysis of 500 melanocytic lesions. Hautarzt 1998;<br />
49: 473–76.<br />
25 Kittler H, Seltenheim M, Pehamberger H, et al. Diagnostic<br />
informativeness of compressed digital epiluminescence microscopy<br />
images of pigmented skin lesions compared with photographs.<br />
Melanoma Res 1998; 8: 255–60.<br />
26 Kittler H, Seltenheim M, Dawid M, et al. Morphologic changes of<br />
pigmented skin lesions: a useful extension of the ABCD rule for<br />
dermatoscopy. J Am Acad Dermatol 1999; 40: 558–62.<br />
27 Krahn G, Gottlober P, Sander C, Peter RU. Dermatoscopy and<br />
high frequency sonography: two useful non-invasive methods to<br />
increase preoperative diagnostic accuracy in pigmented skin<br />
lesions. Pigment Cell Res 1998; 11: 151–54.<br />
28 Lorentzen H, Weismann K, Petersen CS, et al. Clinical and<br />
dermatoscopic diagnosis of malignant melanoma assessed by<br />
expert and non-expert groups. Acta Dermatol Venereol 1999; 79:<br />
301–04.<br />
29 Lorentzen H, Weismann K, Kenet RO, et al. Comparison of<br />
dermatoscopic ABCD rule and risk stratification in the diagnosis of<br />
malignant melanoma. Acta Dermatol Venereol 2000; 80: 122–26.<br />
30 Nilles M, Boedeker RH, Schill WB. Surface microscopy of naevi<br />
and melanomas: clues to melanoma. Br J Dermatol 1994; 130:<br />
349–55.<br />
31 Seidenari S, Pellacani G, Pepe P. Digital videomicroscopy improves<br />
diagnostic accuracy for melanoma. J Am Acad Dermatol 1998; 39:<br />
175–81.<br />
32 Soyer HP, Smolle J, Leitinger G, et al. Diagnostic reliability of<br />
dermoscopic criteria for detecting malignant melanoma.<br />
Dermatology 1995; 190: 25–30.<br />
33 Stanganelli I, Serafini M, Cainelli T, et al. Accuracy of<br />
epiluminescence microscopy among practical dermatologists: a<br />
study from the Emilia-Romagna region of Italy. Tumori 1998; 84:<br />
701–05.<br />
34 Stanganelli I, Serafini M, Bucchi L. A cancer-registry-assisted<br />
evaluation of the accuracy of digital epiluminescence microscopy<br />
associated with clinical examination of pigmented skin lesions.<br />
Dermatology 2000; 200: 11–16.<br />
35 Steiner A, Pehamberger H, Wolff K. In vivo epiluminescence<br />
microscopy of pigmented skin lesions: II, diagnosis of small<br />
pigmented skin lesions and early detection of malignant<br />
melanoma. J Am Acad Dermatol 1987; 17: 584–91.<br />
36 Westerhoff K, McCarthy WH, Menzies SW. Increase in the<br />
sensitivity for melanoma diagnosis by primary care physicians<br />
using skin surface microscopy. Br J Dermatol 2000; 143: 1016–20.<br />
37 Mayer J. Systematic review of the diagnostic accuracy of<br />
dermatoscopy in detecting malignant melanoma. Med J Aust 1997;<br />
167: 206–10.<br />
38 Bafounta ML, Beauchet A, Aegerter P, Saiag P. Is dermoscopy<br />
(epiluminescence microscopy) useful for the diagnosis of<br />
melanoma? Results of a meta-analysis using techniques adapted to<br />
the evaluation of diagnostic tests. Arch Dermatol 2001; 137:<br />
1343–50.<br />
39 Piccolo D, Smolle J, Argenziano G, et al. Teledermoscopy: results<br />
of a multicentre study on 43 pigmented skin lesions. J Telemed<br />
Telecare 2000; 6: 132–37.<br />
40 Braun RP, Meier M, Pelloni F, et al. Teledermatoscopy in<br />
Switzerland: a preliminary evaluation. J Am Acad Dermatol 2000;<br />
42: 770–75.<br />
41 Soyer HP, Kenet RO, Wolf IH, et al. Clinicopathological<br />
correlation of pigmented skin lesions using dermoscopy. Eur J<br />
Dermatol 2000; 10: 22–28.<br />