15.12.2012 Views

Digital Imaging and Communications in Medicine (DICOM)


11.3 Securing the Data

DICOM anonymization software keeps a list of these confidential attributes (some of them provided in part PS3.15 of the standard, Annex E), and removes them from DICOM files. As a result, it produces anonymized DICOM files that still contain the image and nonconfidential data sufficient for adequate image display, but lack any confidential information. You can freely, publicly, and safely distribute anonymized DICOM files for any practical reason.

The early implementations of this approach produced a hodgepodge of DICOM anonymizers varying from simple delete-all-patient-information programs to intricate manual DICOM editors in which the user had total control over removing and editing DICOM file content (attributes). The latter choice, however, is clearly impractical: you do not want to manually edit some 500 files in your average MR study, because it would take forever.[35] Therefore, the automatic approach has become the most popular; but it, too, has its own shortcomings.

The biggest mistake made by many DICOM anonymizers is the automated removal of confidential fields from the files. Consider, for example, an attribute such as “Patient ID” (element (0010,0020) in the DICOM Data Dictionary, see also 5.6.1). This attribute is clearly confidential because it uniquely points to the patient. Moreover, many DICOM systems use patient name, social security number, or date of birth for Patient ID. However, one cannot simply wipe the Patient ID out of a DICOM file. This attribute is DICOM-required, and its removal would make the file or DICOM object invalid. Therefore, the attribute has to be present, but it needs to be changed into something meaningless and absolutely unrelated to the original ID value.
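
The distinction between removing an attribute and replacing its value can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a DICOM object is modeled as a plain dictionary keyed by (group, element) tags, the `REMOVE` and `REPLACE` tables are hypothetical and far from complete, and Patient's Address (0010,1040) is assumed to be optional. How the replacement value is chosen is deliberately left to the caller through the `placeholder` argument.

```python
from typing import Callable, Dict, Tuple

Tag = Tuple[int, int]  # (group, element), e.g. (0x0010, 0x0020)

PATIENT_ID: Tag = (0x0010, 0x0020)
PATIENT_ADDRESS: Tag = (0x0010, 0x1040)
ROWS: Tag = (0x0028, 0x0010)

# Illustrative policy tables (assumptions for this sketch, not a complete list).
REMOVE = {PATIENT_ADDRESS}  # confidential and assumed optional: can be dropped
REPLACE = {PATIENT_ID}      # confidential but required: must stay present


def anonymize(attributes: Dict[Tag, object],
              placeholder: Callable[[Tag, object], object]) -> Dict[Tag, object]:
    """Return a copy with REMOVE tags dropped and REPLACE tags rewritten."""
    result = {}
    for tag, value in attributes.items():
        if tag in REMOVE:
            continue  # attribute disappears from the output
        # Required attributes keep their tag but receive a new value.
        result[tag] = placeholder(tag, value) if tag in REPLACE else value
    return result
```

The function never deletes a tag listed in `REPLACE`, so the output keeps the required attribute while the image-related data (here, Rows) passes through unchanged.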

Let’s say that the original Patient ID value was “1234567” and our DICOM anonymization software automatically replaced it with “wo4_ejF9h”. Mission accomplished? Not at all! Not only should this replacement hide the original data, but it should also consistently reproduce the result regardless of how and when it was done. Combining the two requirements, hidden and consistent, as you can guess, becomes the most intricate part of any anonymization. For example, all entries with the same 1234567 ID that we might encounter later on (say, 2 years later, when this patient comes for another exam), or all ID entries in a 2000-image CT study for this patient, must be consistently replaced with the same “wo4_ejF9h” string. Otherwise, we would break a single patient into a mix of unrelated pieces, destroying the original image and data relationship. Thus, our anonymizing software should replace the confidential tag value with its meaningless placeholder in a unique way. This already begins to sound like data encryption.
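
One common way to get a placeholder that is both hidden and consistent is a keyed hash of the original ID. The sketch below uses HMAC-SHA-256 from the Python standard library; this is our choice for illustration, not a method prescribed by the text or by the DICOM standard, and strictly speaking it is a one-way keyed hash rather than encryption. The function name `pseudonymize`, the `secret_key` parameter, and the 16-character length are all assumptions.

```python
import base64
import hashlib
import hmac


def pseudonymize(patient_id: str, secret_key: bytes, length: int = 16) -> str:
    """Map an ID to a fixed placeholder that depends only on the ID and the key."""
    digest = hmac.new(secret_key, patient_id.encode("utf-8"),
                      hashlib.sha256).digest()
    # Base32 yields only A-Z and 2-7, which keeps the placeholder plain text.
    token = base64.b32encode(digest).decode("ascii")
    # Truncation keeps the value short; a shorter token means a higher
    # (though still very small at 16 characters) chance of collisions.
    return token[:length]


if __name__ == "__main__":
    key = b"keep-this-key-secret"  # hypothetical key; store it securely
    # The same ID gives the same placeholder today and two years from now,
    # as long as the same key is used.
    print(pseudonymize("1234567", key) == pseudonymize("1234567", key))
```

Because the output depends only on the ID and the key, every image in a study and every later exam of the same patient receives the same placeholder without keeping a lookup table. Anyone who holds the key can recompute placeholders for candidate IDs, so the key must be protected as carefully as the original data.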

Furthermore, no two different patients in our example should receive the same modified ID. If we anonymized another patient ID using the same “wo4_ejF9h” string, we would glue two totally unrelated people into a single Siamese twin with all the unpleasant consequences. In the extreme case, if we simply replace any patient ID with a blank (as many anonymizers do), we would es-

[35] Also, remember that some attributes may depend on others (see 5.5.6), so you cannot edit them freely.
