Digital Imaging and Communications in Medicine (DICOM)
11.3 Securing the Data

DICOM anonymization software keeps a list of these confidential attributes
(some of them provided in part PS3.15 of the standard, Annex E), and removes
them from DICOM files. As a result, it produces anonymized DICOM files that
still contain the image and nonconfidential data sufficient for adequate image
display, but lack any confidential information. You can freely, publicly, and
safely distribute anonymized DICOM files for any practical reason.
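
The list-based removal described above can be sketched in a few lines. This is a minimal illustration, not a real anonymizer: the dataset is modelled as a plain Python dictionary keyed by (group, element) tags rather than a parsed DICOM file, the names `CONFIDENTIAL_TAGS` and `strip_confidential` are hypothetical, and the three tags listed are only a small sample of what a real confidentiality list would hold.

```python
# Toy model: a DICOM dataset as a dict mapping (group, element) -> value.
# CONFIDENTIAL_TAGS is an illustrative sample, not the standard's full list.
CONFIDENTIAL_TAGS = {
    (0x0008, 0x0080),  # Institution Name
    (0x0010, 0x1040),  # Patient's Address
    (0x0010, 0x2154),  # Patient's Telephone Numbers
}


def strip_confidential(dataset, confidential=CONFIDENTIAL_TAGS):
    """Return a copy of the dataset without the listed confidential tags."""
    return {tag: value for tag, value in dataset.items() if tag not in confidential}


original = {
    (0x0008, 0x0060): "MR",                # Modality
    (0x0008, 0x0080): "Example Clinic",    # Institution Name
    (0x0010, 0x1040): "1 Example Street",  # Patient's Address
    (0x0028, 0x0010): 256,                 # Rows
    (0x0028, 0x0011): 256,                 # Columns
}
anonymized = strip_confidential(original)
```

In this sketch the image geometry and modality survive while the listed attributes disappear, and the original dictionary is left untouched.
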

The early implementations of this approach produced a hodgepodge of
DICOM anonymizers, varying from simple delete-all-patient-information programs
to intricate manual DICOM editors in which the user had total control
over removing and editing DICOM file content (attributes). The latter choice,
however, is clearly impractical: you do not want to manually edit some 500 files
in your average MR study, because it would take forever.35 Therefore, the automatic
approach has become the most popular; but it, too, has its own shortcomings.

The biggest mistake made by many DICOM anonymizers is the automated
removal of confidential fields from the files. Consider, for example, an attribute such
as “Patient ID” (element (0010,0020) in the DICOM Data Dictionary, see also
5.6.1). This attribute is clearly confidential because it uniquely points to the
patient. Moreover, many DICOM systems use patient name, social security
number, or date of birth for Patient ID. However, one cannot simply wipe the
Patient ID out of a DICOM file. This attribute is DICOM-required, and its removal
would make the file or DICOM object invalid. Therefore, the attribute
has to be present, but it needs to be changed into something meaningless and
absolutely unrelated to the original ID value.
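
The difference between wiping a required attribute and replacing its value can be shown with a toy check. This is only a sketch under simplifying assumptions: the dataset is again a plain dictionary, the helper names are hypothetical, and "validity" is reduced to the presence of one required tag, which is far weaker than real DICOM conformance checking.

```python
PATIENT_ID = (0x0010, 0x0020)
REQUIRED_TAGS = {PATIENT_ID}  # simplified: a real object has many required tags


def has_required_tags(dataset):
    """Simplified validity check: every required tag must be present."""
    return all(tag in dataset for tag in REQUIRED_TAGS)


def wipe_patient_id(dataset):
    """The mistake: delete the attribute outright."""
    return {tag: value for tag, value in dataset.items() if tag != PATIENT_ID}


def replace_patient_id(dataset, placeholder):
    """The fix: keep the attribute but overwrite its value."""
    result = dict(dataset)
    result[PATIENT_ID] = placeholder
    return result


original = {PATIENT_ID: "1234567", (0x0008, 0x0060): "CT"}
wiped = wipe_patient_id(original)
replaced = replace_patient_id(original, "wo4_ejF9h")
```

Here `has_required_tags(wiped)` is false while `has_required_tags(replaced)` is true, which mirrors the point made above: the attribute must stay, only its value may change.
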

Let’s say that the original Patient ID value was “1234567” and our DICOM
anonymization software automatically replaced it with “wo4_ejF9h”. Mission
accomplished? Not at all! Not only should this replacement hide the original
data, but it should also consistently reproduce the result regardless of how and
when it was done. Combining hidden and consistent, as you can guess, becomes
the most intricate part of any anonymization. For example, all entries with the
same 1234567 ID that we might encounter later on (say, 2 years later, when this
patient comes for another exam), or all ID entries in a 2000-image CT study for
this patient must be consistently replaced with the same “wo4_ejF9h” string.
Otherwise, we would break a single patient into a mix of unrelated pieces, destroying
the original image and data relationship. Thus, our anonymizing software
should replace the confidential tag value with its meaningless placeholder
in a unique way. This already begins to sound like data encryption.
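
One possible way to make a replacement both hidden and consistent is a keyed hash. The sketch below uses HMAC-SHA256 from the Python standard library; this is an assumption chosen for illustration, not a method the text prescribes, and the function name `pseudonymize_patient_id`, the example key, and the 16-character length are all hypothetical. Note that a keyed hash is not encryption: the placeholder cannot be decrypted back into the original ID, but anyone holding the key can recompute it, so the key has to stay secret.

```python
import base64
import hashlib
import hmac


def pseudonymize_patient_id(patient_id: str, secret_key: bytes, length: int = 16) -> str:
    """Map a Patient ID to a repeatable placeholder using HMAC-SHA256.

    The same (patient_id, secret_key) pair always yields the same string.
    The digest is truncated, so distinct IDs are only overwhelmingly likely,
    not guaranteed, to receive distinct placeholders.
    """
    digest = hmac.new(secret_key, patient_id.encode("utf-8"), hashlib.sha256).digest()
    # Base32 keeps the result to upper-case letters and the digits 2-7.
    token = base64.b32encode(digest).decode("ascii").rstrip("=")
    return token[:length]


key = b"site-secret-key-kept-private"  # hypothetical key for the example
first_visit = pseudonymize_patient_id("1234567", key)
later_visit = pseudonymize_patient_id("1234567", key)
other_patient = pseudonymize_patient_id("7654321", key)
```

With the same key, `first_visit` and `later_visit` are identical no matter when or on which file the function runs, so every image of the patient keeps pointing to one placeholder. If a strict guarantee of distinct placeholders is needed, a stored lookup table would be a safer design than a truncated hash.
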

Furthermore, no two different patients in our example should receive the
same modified ID. If we anonymize another patient ID using the same “wo4_ejF9h”
string, we would glue two totally unrelated people into a single Siamese
twin with all the unpleasant consequences. In the extreme case, if we simply
replace any patient ID with a blank (as many anonymizers do), we would es-

35 Also, remember that some attributes may depend on others (see 5.5.6), so you
cannot edit them freely.