Automatic Mapping Clinical Notes to Medical - RMIT University
Automatic Mapping Clinical Notes to Medical - RMIT University
Automatic Mapping Clinical Notes to Medical - RMIT University
You also want an ePaper? Increase the reach of your titles
YUMPU automatically turns print PDFs into web optimized ePapers that Google loves.
<strong>Au<strong>to</strong>matic</strong> <strong>Mapping</strong> <strong>Clinical</strong> <strong>Notes</strong> <strong>to</strong> <strong>Medical</strong> Terminologies<br />
Jon Patrick, Yefeng Wang and Peter Budd<br />
School of Information Technologies<br />
<strong>University</strong> of Sydney<br />
NSW 2006, Australia<br />
{jonpat,ywang1,pbud3427}@it.usyd.edu.au<br />
Abstract<br />
<strong>Au<strong>to</strong>matic</strong> mapping of key concepts from<br />
clinical notes <strong>to</strong> a terminology is an important<br />
task <strong>to</strong> achieve for extraction of<br />
the clinical information locked in clinical<br />
notes and patient reports. The present paper<br />
describes a system that au<strong>to</strong>matically<br />
maps free text in<strong>to</strong> a medical reference<br />
terminology. The algorithm utilises Natural<br />
Language Processing (NLP) techniques<br />
<strong>to</strong> enhance a lexical <strong>to</strong>ken<br />
matcher. In addition, this algorithm is<br />
able <strong>to</strong> identify negative concepts as well<br />
as performing term qualification. The algorithm<br />
has been implemented as a web<br />
based service running at a hospital <strong>to</strong><br />
process real-time data and demonstrated<br />
that it worked within acceptable time<br />
limits and accuracy limits for them. However<br />
broader acceptability of the algorithm<br />
will require comprehensive evaluations.<br />
1 Introduction<br />
<strong>Medical</strong> notes and patient reports provide a<br />
wealth of medical information about disease and<br />
medication effects. However a substantial<br />
amount of clinical data is locked away in nonstandardised<br />
forms of clinical language which<br />
could be usefully mined <strong>to</strong> gain greater understanding<br />
of patient care and the progression of<br />
diseases if standardised. Unlike well written texts,<br />
such as scientific papers and formal medical reports,<br />
which generally conform <strong>to</strong> conventions of<br />
structure and readability, the clinical notes about<br />
patients written by a general practitioners, are in<br />
a less structured and often minimal grammatical<br />
form. As these notes often have little if any for-<br />
mal organisation, it is difficult <strong>to</strong> extract information<br />
systematically. Nowadays there is an increased<br />
interest in the au<strong>to</strong>mated processing of<br />
clinical notes by Natural Language Processing<br />
(NLP) methods which can exploit the underlying<br />
structure inherent in language itself <strong>to</strong> derive<br />
meaningful information (Friedman et al., 1994).<br />
In principle, clinical notes could be recorded<br />
in a coded form such as SNOMED CT<br />
(SNOMED International, 2006) or UMLS<br />
(Lindberg et al., 1993), however, in practice<br />
notes are written and s<strong>to</strong>red in a free text representation.<br />
It is believed that the encoding of<br />
notes will provide better information for document<br />
retrieval and research in<strong>to</strong> clinical practice<br />
(Brown and Sönksen, 2000). The use of standard<br />
terminologies for clinical data representation is<br />
critical. Many clinical information systems enforce<br />
standard semantics by mandating structured<br />
data entry. Transforming findings, diseases,<br />
medication procedures in clinical notes in<strong>to</strong><br />
structured, coded form is essential for clinician<br />
research and decision support system. Using<br />
concepts in domain specific terminology can enhance<br />
retrieval. Therefore, converting free text in<br />
clinical notes <strong>to</strong> terminology is a fundamental<br />
problem in many advanced medical information<br />
systems.<br />
SNOMED CT is the most comprehensive<br />
medical terminology in the world and it has been<br />
adopted by the Australia government <strong>to</strong> encode<br />
clinical disease and patient reports. The doc<strong>to</strong>rs<br />
want a system <strong>to</strong> develop a standard terminology<br />
on SNOMED CT for reporting medical complaints<br />
so that their information is exchangeable<br />
and semantically consistent for other practitioners,<br />
and permit au<strong>to</strong>matic extraction of the contents<br />
of clinical notes <strong>to</strong> compile statistics about<br />
diseases and their treatment. Translate medical<br />
concepts in free text in<strong>to</strong> standard medical terminology<br />
in coded form is a hard problem, and cur-<br />
Proceedings of the 2006 Australasian Language Technology Workshop (ALTW2006), pages 73–80.<br />
73