22.07.2013 Views

Automatic Mapping Clinical Notes to Medical - RMIT University

Automatic Mapping Clinical Notes to Medical - RMIT University

Automatic Mapping Clinical Notes to Medical - RMIT University

SHOW MORE
SHOW LESS

You also want an ePaper? Increase the reach of your titles

YUMPU automatically turns print PDFs into web optimized ePapers that Google loves.

<strong>Au<strong>to</strong>matic</strong> <strong>Mapping</strong> <strong>Clinical</strong> <strong>Notes</strong> <strong>to</strong> <strong>Medical</strong> Terminologies<br />

Jon Patrick, Yefeng Wang and Peter Budd<br />

School of Information Technologies<br />

<strong>University</strong> of Sydney<br />

NSW 2006, Australia<br />

{jonpat,ywang1,pbud3427}@it.usyd.edu.au<br />

Abstract<br />

<strong>Au<strong>to</strong>matic</strong> mapping of key concepts from<br />

clinical notes <strong>to</strong> a terminology is an important<br />

task <strong>to</strong> achieve for extraction of<br />

the clinical information locked in clinical<br />

notes and patient reports. The present paper<br />

describes a system that au<strong>to</strong>matically<br />

maps free text in<strong>to</strong> a medical reference<br />

terminology. The algorithm utilises Natural<br />

Language Processing (NLP) techniques<br />

<strong>to</strong> enhance a lexical <strong>to</strong>ken<br />

matcher. In addition, this algorithm is<br />

able <strong>to</strong> identify negative concepts as well<br />

as performing term qualification. The algorithm<br />

has been implemented as a web<br />

based service running at a hospital <strong>to</strong><br />

process real-time data and demonstrated<br />

that it worked within acceptable time<br />

limits and accuracy limits for them. However<br />

broader acceptability of the algorithm<br />

will require comprehensive evaluations.<br />

1 Introduction<br />

<strong>Medical</strong> notes and patient reports provide a<br />

wealth of medical information about disease and<br />

medication effects. However a substantial<br />

amount of clinical data is locked away in nonstandardised<br />

forms of clinical language which<br />

could be usefully mined <strong>to</strong> gain greater understanding<br />

of patient care and the progression of<br />

diseases if standardised. Unlike well written texts,<br />

such as scientific papers and formal medical reports,<br />

which generally conform <strong>to</strong> conventions of<br />

structure and readability, the clinical notes about<br />

patients written by a general practitioners, are in<br />

a less structured and often minimal grammatical<br />

form. As these notes often have little if any for-<br />

mal organisation, it is difficult <strong>to</strong> extract information<br />

systematically. Nowadays there is an increased<br />

interest in the au<strong>to</strong>mated processing of<br />

clinical notes by Natural Language Processing<br />

(NLP) methods which can exploit the underlying<br />

structure inherent in language itself <strong>to</strong> derive<br />

meaningful information (Friedman et al., 1994).<br />

In principle, clinical notes could be recorded<br />

in a coded form such as SNOMED CT<br />

(SNOMED International, 2006) or UMLS<br />

(Lindberg et al., 1993), however, in practice<br />

notes are written and s<strong>to</strong>red in a free text representation.<br />

It is believed that the encoding of<br />

notes will provide better information for document<br />

retrieval and research in<strong>to</strong> clinical practice<br />

(Brown and Sönksen, 2000). The use of standard<br />

terminologies for clinical data representation is<br />

critical. Many clinical information systems enforce<br />

standard semantics by mandating structured<br />

data entry. Transforming findings, diseases,<br />

medication procedures in clinical notes in<strong>to</strong><br />

structured, coded form is essential for clinician<br />

research and decision support system. Using<br />

concepts in domain specific terminology can enhance<br />

retrieval. Therefore, converting free text in<br />

clinical notes <strong>to</strong> terminology is a fundamental<br />

problem in many advanced medical information<br />

systems.<br />

SNOMED CT is the most comprehensive<br />

medical terminology in the world and it has been<br />

adopted by the Australia government <strong>to</strong> encode<br />

clinical disease and patient reports. The doc<strong>to</strong>rs<br />

want a system <strong>to</strong> develop a standard terminology<br />

on SNOMED CT for reporting medical complaints<br />

so that their information is exchangeable<br />

and semantically consistent for other practitioners,<br />

and permit au<strong>to</strong>matic extraction of the contents<br />

of clinical notes <strong>to</strong> compile statistics about<br />

diseases and their treatment. Translate medical<br />

concepts in free text in<strong>to</strong> standard medical terminology<br />

in coded form is a hard problem, and cur-<br />

Proceedings of the 2006 Australasian Language Technology Workshop (ALTW2006), pages 73–80.<br />

73

Hooray! Your file is uploaded and ready to be published.

Saved successfully!

Ooh no, something went wrong!