Swiss Medical Informatics - SGMI


Typical recommendations are useful for prescribing the most appropriate antibiotic according to different parameters, such as patient situation and clinical assessment, but also costs, benefits, adverse effects, and the risk of resistance development. As a case study, we started investigating guidelines for geriatrics from the University Hospitals of Geneva (HUG). To transform such verbose documents into machine-readable data, the guidelines were transformed into database tuples. The translation from French to English was performed manually, assisted by French-to-English translation tools (e.g. http://eagl.unige.ch/EAGLm/) and a SNOMED categoriser [10]. For some of the queries several answers were possible (three on average), as shown in table 1, where severe diverticulitis caused by Enterobacteriaceae can be treated by three different antibiotics: ceftriaxone, metronidazole and piperacillin-tazobactam; each of them is unambiguously associated with a unique terminological identifier.
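
As an illustration of what such database tuples might look like, the following minimal sketch encodes the two rules of table 1 as Python records. The class name `Rule`, its field names and the helper `expected_antibiotics` are assumptions made for this example; the source does not describe the actual database schema.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Hypothetical record layout: each concept is a (label, identifier) pair.
Concept = Tuple[str, str]


@dataclass(frozen=True)
class Rule:
    disease: Concept
    pathogen: Concept
    antibiotics: Tuple[Concept, ...]
    condition: Optional[str] = None  # e.g. "severe"; None when no precondition


# The two rules shown in table 1, with the identifiers given there.
RULES = [
    Rule(
        disease=("Diverticulitis", "D004238"),
        pathogen=("Enterobacteriaceae", "543"),
        antibiotics=(
            ("Amoxicillin-potassium clavulanate combination", "D019980"),
            ("Ciprofloxacin", "D002939"),
            ("Metronidazole", "D008795"),
        ),
    ),
    Rule(
        disease=("Diverticulitis", "D004238"),
        pathogen=("Enterobacteriaceae", "543"),
        antibiotics=(
            ("Ceftriaxone", "D002443"),
            ("Metronidazole", "D008795"),
            ("Piperacillin-tazobactam combination product", "C085143"),
        ),
        condition="severe",
    ),
]


def expected_antibiotics(rules, disease_id, pathogen_id, condition=None):
    """Return the antibiotic identifiers of the rules matching the query."""
    return [
        antibiotic_id
        for rule in rules
        if rule.disease[1] == disease_id
        and rule.pathogen[1] == pathogen_id
        and rule.condition == condition
        for _, antibiotic_id in rule.antibiotics
    ]
```

Calling `expected_antibiotics(RULES, "D004238", "543", "severe")` returns the three identifiers listed for severe diverticulitis, which is the form in which a rule can later serve as a gold-standard answer set.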

Further, two search engines corresponding to different search models were tested: easyIR (a relevance-driven search engine well known for outperforming other search methods on MEDLINE search tasks [11]) and PubMed (the NCBI's Boolean search instrument). In addition, different combinations of the outputs of the two engines (PubMed and easyIR) were tested to combine the strengths of both engines.
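
The text does not specify how the outputs of the two engines were combined. Purely as an illustration of one common way to merge two ranked lists, the sketch below uses reciprocal rank fusion; this technique, the function name and the constant `k=60` are assumptions and not necessarily what was used in the study. The two input rankings are made-up examples.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists; items ranked high in many lists come first."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, item in enumerate(ranking, start=1):
            scores[item] += 1.0 / (k + rank)
    # Sort by decreasing fused score; break ties alphabetically for stability.
    return sorted(scores, key=lambda item: (-scores[item], item))


# Made-up rankings standing in for the outputs of the two engines.
pubmed_ranking = ["ceftriaxone", "metronidazole", "ciprofloxacin"]
easyir_ranking = ["metronidazole", "piperacillin-tazobactam", "ceftriaxone"]

fused = reciprocal_rank_fusion([pubmed_ranking, easyir_ranking])
```

An answer returned by both engines accumulates score from each list, so it tends to rise above answers proposed by only one engine.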

All targets were normalised using standard terminologies. Three terminologies were tested to normalise the antibiotic type of target: a list of 70 SNOMED CT terms, corresponding chiefly to the antibiotics available at the University Hospitals of Geneva; a list of MeSH terms, including synonymous terms, corresponding to the UMLS Semantic Type T195; and a list of 70 MeSH terms, including synonymous terms, corresponding mostly to the antibiotics available at the University Hospitals of Geneva and associated with WHO-ATC identifiers. The disease type of target was normalised using a list of MeSH terms for diseases, corresponding to the following UMLS Semantic Types: T020, T190, T049, T019, T047, T050, T033, T037, T047, T191, T046 and T184. The pathogen type of target was normalised using a subset of the NEWT taxonomy corresponding to the bacterial taxonomy.
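
Normalisation against a term list with synonyms can be pictured as a dictionary lookup from surface forms to identifiers. The sketch below is a minimal illustration, assuming a tiny hand-made synonym table; the identifiers come from table 1, the synonym strings for D019980 from the results section, and the names `SYNONYMS`, `build_index` and `normalise` are hypothetical.

```python
import re

# Tiny illustrative synonym table: identifier -> surface forms.
SYNONYMS = {
    "D019980": [
        "amoxicillin-potassium clavulanate combination",
        "amoxycillin with clavulanate potassium",
        "amoxicillin-clavulanic acid",
        "augmentin",
    ],
    "D002443": ["ceftriaxone"],
    "D008795": ["metronidazole"],
}


def build_index(synonyms):
    """Invert the table into a lower-cased term -> identifier lookup."""
    return {
        term.lower(): identifier
        for identifier, terms in synonyms.items()
        for term in terms
    }


def normalise(text, index):
    """Return the identifiers of all known terms found in the text."""
    lowered = text.lower()
    found = []
    for term, identifier in index.items():
        # Match whole terms only, not fragments inside longer words.
        pattern = r"(?<!\w)" + re.escape(term) + r"(?!\w)"
        if re.search(pattern, lowered) and identifier not in found:
            found.append(identifier)
    return found


INDEX = build_index(SYNONYMS)
```

With this index, a sentence mentioning "Augmentin" and one mentioning "amoxicillin-clavulanic acid" are both mapped to the same identifier, which is what lets several terms contribute to a single antibiotic entity.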

PROCEEDINGS ANNUAL MEETING 2009

Furthermore, to fine-tune the question-answering module, several descriptors, in particular generic ones, needed to be removed. Thus, infectious diseases or cross-infection were removed from the descriptor list for the disease type of target. Finally, specific keywords were used to refine the search equation. Thus, we added context-specific descriptors such as geriatrics, elderly, etc. The impact of general keywords such as recommended antibiotic, antibiotherapy, etc. was also tested.
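
The two refinements described here, dropping generic descriptors and adding context-specific keywords to the search equation, might be sketched as follows. The Boolean query syntax and the names `GENERIC_DESCRIPTORS`, `CONTEXT_KEYWORDS`, `filter_descriptors` and `build_query` are assumptions for illustration; the exact form of the search equation is not given in the text.

```python
# Generic descriptors named in the text as removed from the disease list.
GENERIC_DESCRIPTORS = {"infectious diseases", "cross-infection"}

# Context-specific keywords named in the text.
CONTEXT_KEYWORDS = ("geriatrics", "elderly")


def filter_descriptors(descriptors):
    """Drop descriptors that are too generic to be useful answers."""
    return [d for d in descriptors if d.lower() not in GENERIC_DESCRIPTORS]


def build_query(disease, pathogen, context=CONTEXT_KEYWORDS):
    """Assemble a Boolean search equation (hypothetical syntax)."""
    core = f'"{disease}" AND "{pathogen}"'
    if not context:
        return core
    context_clause = " OR ".join(f'"{keyword}"' for keyword in context)
    return f"{core} AND ({context_clause})"
```

Passing an empty `context` gives the baseline equation, which makes it easy to compare runs with and without the added keywords.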

Results and discussion<br />

The evaluation of our results is done with TrecEval, a program developed to evaluate TREC (Text REtrieval Conference) results using NIST (US National Institute of Standards and Technology) evaluation procedures. Fine-tuning of the engine was based on the TREC Genomics competitions: see e.g. [11]. As usual in information retrieval [12] and factoid question-answering tasks, we focus on precision-oriented metrics. In particular, the so-called precision at recall 0, i.e. the precision of the top-returned answer, is used to evaluate the effectiveness of our approach.

To complement this metric, which provides the precision of the system used without user interaction, we also measure the recall achieved by the system's top five answers. Thus, we try to estimate how useful such a system would be when used by an expert able to validate the guideline generator's ranked output.
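
The two measures can be made concrete with a small sketch. It is not the TrecEval implementation: the function names are hypothetical, the ranked list is a made-up system output, and "OTHER" is a placeholder for an irrelevant answer. The gold set reuses the identifiers given in table 1 for severe diverticulitis.

```python
def top_precision(ranked, relevant):
    """Precision of the top-returned answer: 1.0 if it is correct, else 0.0."""
    return 1.0 if ranked and ranked[0] in relevant else 0.0


def recall_at_k(ranked, relevant, k=5):
    """Fraction of the gold-standard answers found among the top k answers."""
    if not relevant:
        return 0.0
    hits = set(ranked[:k]) & set(relevant)
    return len(hits) / len(relevant)


def mean_top_precision(runs):
    """Average top precision over (ranked, relevant) pairs, one per query."""
    if not runs:
        return 0.0
    return sum(top_precision(ranked, relevant) for ranked, relevant in runs) / len(runs)


# Gold-standard identifiers for severe diverticulitis (table 1).
gold = {"D002443", "D008795", "C085143"}
# Made-up ranked output of a system for that query.
ranked = ["D008795", "D002939", "D002443", "D019980", "OTHER", "C085143"]

p0 = top_precision(ranked, gold)
r5 = recall_at_k(ranked, gold, k=5)
```

Here the first answer is in the gold set, so the top precision for this query is 1, while only two of the three gold antibiotics appear in the top five. Averaging the per-query top precision over all queries gives a single figure for the system.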

The 64 triplets generated manually are used as the gold standard. Each rule/query concerns a specific disease caused by a specific pathogen and is represented by a tuple of four columns (table 1). Diseases, pathogens and antibiotics were entered for each entry. Optionally, conditions such as weight or age were also added, depending on the entry. Evaluation focuses on the precision of the first retrieved antibiotic, which corresponds to the top precision, denoted P0 in the following.

Three terminological targets have been tested. The best results are obtained using a subset of MeSH including 70 antibiotic entities: P0 is 0.58 for the PubMed engine and 0.52 for the easyIR engine. Each antibiotic entity is mentioned by several terms, allowing more results to be retrieved. Thus, amoxycillin with clavulanate potassium can also be mentioned as amoxicillin-clavulanic acid or augmentin. Moreover, using a limited number of antibiotic entities avoids returning general terms, such as Anti-Bacterial Agents. Using keywords provides no significant improvement in top precision compared to the baseline system; rather, it leads to a slight decline.

From figure 1 we observe that the two engines, which tend to perform very similarly on average, seem not to behave

Table 1

Example of manually generated rules: terminological identifiers are provided in parentheses. In this example the infection can be treated by five different antibiotics. The use of ceftriaxone requires the precondition severe.

| Pathologies | Pathogenic agents | Antibiotics | Other conditions |
| --- | --- | --- | --- |
| Diverticulitis (D004238) | Enterobacteriaceae (543) | Amoxicillin-potassium clavulanate combination (D019980); ciprofloxacin (D002939); metronidazole (D008795) | |
| Diverticulitis (D004238) | Enterobacteriaceae (543) | Ceftriaxone (D002443); metronidazole (D008795); piperacillin-tazobactam combination product (C085143) | Severe |

Swiss Medical Informatics 2009; no. 67
