CISMeF terminology

CISMeF

Catalog & Index of Health Resources in

French on the Internet

Prof. SJ. Darmoni, MD, PhD

TIBS, LITIS Lab

Rouen University Hospital & Rouen Medical School,

France

Email: Stefan.Darmoni@chu-rouen.fr

MIE Oslo August 2011


Introduction

§ Quality controlled subject gateways (or portals) were

defined by Koch as Internet services which apply a

comprehensive set of quality measures to support

systematic resource discovery.

§ CISMeF = quality controlled health gateway for

French institutional health resources

ü www.cismef.org


Introduction

§ The objective of CISMeF (Catalog and Index of French-speaking

resources) is to assist health professionals and lay

people in searching for electronic information available on

the Internet. CISMeF covers healthcare disciplines and medical

sciences.

§ CISMeF was a project originally initiated by Rouen University

Hospital (RUH).

§ URL: http://www.cismef.org & http://www.chu-rouen.fr/cismef

§ CISMeF began in February 1995

§ Doc’CISMeF in 2000: creation of a generic search tool using

the CISMeF semi-informal ontology

§ URL: http://doccismef.chu-rouen.fr/

Methods of Information in Medicine 2000 Jan;39(1):30-35

Medical Informatics & The Internet in Medicine 2001;26(3):165-178


CISMeF terminology

§ Two standard tools for organising information:

ü the MeSH (Medical Subject Headings) thesaurus from the US

National Library of Medicine

ü Several metadata element sets

• the Dublin Core metadata format + CISMeF specific fields

• For teaching resources, IEEE 1484 LOM metadata format

11 elements of the LOM Educational category => DC.Education

• For evidence-based medicine resources, CISMeF specific fields: level of

evidence + method to evaluate it

• The HIDDEL metadata set is used to enhance transparency, trust and

quality of health information on the Internet.

§ Do not reinvent the wheel +++ but adapt it
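
A minimal sketch of the "adapt it" idea: a record that carries standard Dublin Core elements next to a few local extension fields. The element names `title`, `creator`, `date`, `language`, `subject` and `type` come from the Dublin Core element set; the class name, the `CISMeF.` prefix and the two evidence field names are assumptions made for illustration, not the actual CISMeF schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ResourceRecord:
    """One catalogued resource: Dublin Core elements plus extension fields."""
    # Elements taken from the Dublin Core Metadata Element Set.
    title: str
    creator: str
    date: str
    language: str
    subject: List[str] = field(default_factory=list)        # e.g. MeSH keywords
    resource_type: List[str] = field(default_factory=list)  # maps to DC "type"
    # Hypothetical names for the CISMeF-specific evidence fields.
    level_of_evidence: Optional[str] = None
    evidence_method: Optional[str] = None

    def to_flat_metadata(self) -> Dict[str, str]:
        """Flatten the record into prefixed key/value pairs."""
        flat = {
            "DC.title": self.title,
            "DC.creator": self.creator,
            "DC.date": self.date,
            "DC.language": self.language,
            "DC.subject": "; ".join(self.subject),
            "DC.type": "; ".join(self.resource_type),
        }
        # Extension fields are emitted only when they are filled in.
        if self.level_of_evidence is not None:
            flat["CISMeF.levelOfEvidence"] = self.level_of_evidence
        if self.evidence_method is not None:
            flat["CISMeF.evidenceMethod"] = self.evidence_method
        return flat


# Invented example values, for illustration only.
example = ResourceRecord(
    title="Example guideline",
    creator="Example medical society",
    date="2011-08-01",
    language="fr",
    subject=["asthma"],
    resource_type=["guideline"],
    level_of_evidence="A",
)
flat = example.to_flat_metadata()
```

Keeping the standard elements untouched and adding local fields under a separate prefix means a generic Dublin Core consumer can still read the record.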

DC-2004, International Conference on Dublin Core and Metadata Applications

Stud Health Technol Inform. 2003;95:707-712


MeSH ‘enhancements’

§ The heterogeneity of Internet health

resources led the CISMeF team to

enhance the MeSH thesaurus with the

introduction of new concepts

ü resource types (N≈300),

ü metaterms (N≈120),

ü predefined queries (N≈200)

Health Information and Libraries Journal 2004 Dec;21(4):253-61


MeSH ‘enhancements’

§ Improvement of the MeSH thesaurus

itself

ü Add-on of 10,000 French synonyms,

including (ambiguous) acronyms

ü Manual translations of 6,000 definitions

(semi-automatic translation for the rest of

the MeSH soon)

ü French translation of >20,000 MeSH

Supplementary Concepts (SC) & add-on of

6,000 synonyms


Strategic revolution in 2005

§ Between 1995 & 2005, mono-terminological world around the

MeSH

§ Since 2005, shift to multi-terminological universe:

ü CCAM, CIM10, SNOMED Int., CIF/CIH, CISP2, DRC

ü Creation of a French Health Multi-Terminological Server (HMTS): ANR,

InterSTIS

ü Multi-Terminological extraction (7th FP EU, PSIP)

ü Multi-Terminological Information Retrieval (JFIM 2009)

§ Several health terminologies for the automatic indexing and the

information retrieval in the CISMeF quality-controlled health

portal… and beyond

§ Can be reused in any European language if health terminologies

are available in your language!!! In particular in Norway


Situation in August 2011

[Figure: TIBS (Information processing in biology and in health), Prof. SJ. Darmoni. Diagram labels: CISMeF Information System; 32 health terminologies; multi-terminology information retrieval; Doc’CISMeF search engine; multi-terminology automatic indexing; ECMT; InfoRoute; European Health multi-Termino-Ontology Cross-lingual Portal (EHTOP).]


Multi-Terminological extraction

§ Collaboration with Vidal company

§ F-MTI & ECMT tools

ü 3 PhDs (A. Névéol, S. Pereira & S. Sakji)

§ Bag of words algorithm, stemming (or lemmatization)

§ Inclusion of health terminologies available in French

ü SNOMED Int, ICD 10, MeSH, MeSH SC, ICDC (included in

UMLS)

ü ATC, CIF (WHO)

ü CCAM, DRC, Orphanet, TUV, CIS, CIP, INN, Brand Names

ü MedDRA, WHO-ART, LOINC (to be included)

ü Recent study on the CISMeF corpus: MonoT vs.

MultiT (AMIA 2009): +7% recall; -12% precision
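
A minimal sketch of bag-of-words extraction with stemming across several terminologies. This is not the F-MTI or ECMT implementation: the stemmer is a deliberately crude stand-in (it only strips a trailing "s"), and the function names and the tiny term lists are invented for illustration.

```python
import re
from typing import Dict, FrozenSet, List


def crude_stem(word: str) -> str:
    """Placeholder stemmer: lowercase and drop one trailing 's'."""
    word = word.lower()
    if word.endswith("s") and len(word) > 3:
        return word[:-1]
    return word


def bag_of_stems(text: str) -> FrozenSet[str]:
    """Turn a text into an unordered set of stemmed words."""
    words = re.findall(r"[^\W\d_]+", text)  # runs of letters only
    return frozenset(crude_stem(w) for w in words)


def extract_terms(text: str,
                  terminologies: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Return, per terminology, the terms whose stems all occur in the text."""
    text_bag = bag_of_stems(text)
    found: Dict[str, List[str]] = {}
    for name, terms in terminologies.items():
        hits = []
        for term in terms:
            term_bag = bag_of_stems(term)
            # Word order is ignored: only set inclusion matters.
            if term_bag and term_bag <= text_bag:
                hits.append(term)
        if hits:
            found[name] = hits
    return found


# Invented toy term lists; real terminologies hold many thousands of terms.
terminologies = {
    "MeSH": ["asthma", "myocardial infarction"],
    "ATC": ["salbutamol"],
}
result = extract_terms("Treatment of asthma attacks with salbutamol",
                       terminologies)
```

Because word order and proximity are ignored, adding terminologies tends to add matches of every kind, which is consistent with the recall gain and precision loss reported above.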


Multi-Terminological extraction

§ New concept: automatic affiliation of a subheading to

a MeSH term

§ Manual affiliation of a subheading to a MeSH

Supplementary Concept (Evaluation to perform)

§ Stoilos & Levenshtein distances (PhD Z. Moalla Y2)

§ In the near future MeSH Indexing at the concept level

and not anymore at the descriptor level

ü Interesting for rare diseases; potential collaboration
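
The slide names the Levenshtein distance without saying how it is applied. The sketch below shows only the distance itself, plus a normalised similarity that could be used to compare two term labels; the `similarity` helper and its normalisation are assumptions, not the method used in the PhD work.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of insertions, deletions and substitutions from a to b."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Case-insensitive similarity in [0, 1]; 1.0 means identical labels."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a.lower(), b.lower()) / longest
```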


IR in CISMeF: currently

§ Only three steps

Step 1: Reserved terms (∈ CISMeF terminology) OR

document's title

Step 2: The CISMeF metadata

Mixing the reserved terms, all fields and adjacency in the

titles (word adjacency: (n-1)*5)

Step 3: Adjacency in the plain texts

Mixing the reserved terms, all fields and adjacency in the

plain texts (word adjacency: (n-1)*10)
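
A sketch of the word-adjacency test, under one reading of the formulas above: n is taken to be the number of query words, and all query words must occur within a span of at most (n-1)*5 word positions in a title, or (n-1)*10 in plain text. That reading, the tokenizer and the function names are assumptions.

```python
import re
from itertools import product
from typing import List

TITLE_FACTOR = 5   # from "(n-1)*5" for titles
TEXT_FACTOR = 10   # from "(n-1)*10" for plain texts


def tokenize(text: str) -> List[str]:
    """Lowercase word tokens."""
    return re.findall(r"\w+", text.lower())


def within_adjacency(query: str, text: str, factor: int) -> bool:
    """True if every query word occurs and all fit in a (n-1)*factor span."""
    query_words = tokenize(query)
    tokens = tokenize(text)
    n = len(query_words)
    if n == 0:
        return False
    # Positions in the text where each query word occurs.
    positions = [[i for i, tok in enumerate(tokens) if tok == word]
                 for word in query_words]
    if any(not p for p in positions):
        return False  # at least one query word is missing
    max_span = (n - 1) * factor
    # Brute force over one position per query word; fine for short queries.
    return any(max(combo) - min(combo) <= max_span
               for combo in product(*positions))


title = "Guidelines for the management of acute asthma in children"
```

With two query words, the title step allows a gap of 5 positions and the plain-text step a gap of 10, so the later step is the more permissive one.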


CISMeF Information Retrieval

§ Since 2005, four levels of indexing in CISMeF

ü Level 1: manual indexing (e.g. guidelines)

ü Level 2: supervised indexing (e.g. technical report

or teaching document from national medical

societies)

ü Level 3: automatic indexing (e.g. SCPs, teaching

document from one medical school)

ü Level 4: extending the CISMeF corpus => Google

CISMeF (restricted to publishers included in

CISMeF)


CISMeF Information Retrieval

§ Some differences with PubMed

ü Automatically indexed resources are included

§ CISMeF resource ranking

ü Analysis of the query

ü MeSH Major (or Title) first (display of score)

• Then, date (as in PubMed)

ü Automatic (Title or SubTitle)

ü Minor MeSH
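
The ranking order above can be sketched as a two-part sort key: match tier first, then most recent date. Applying the date tie-break inside every tier, and the names `Hit`, `match_tier` and `rank`, are assumptions; the slide states the date rule only for the first tier and does not describe how the displayed score is computed.

```python
from dataclasses import dataclass
from datetime import date
from typing import List

# Tiers in the order listed on the slide (lower value ranks first).
MAJOR_OR_TITLE = 0  # query matches a major MeSH term or the title
AUTOMATIC = 1       # match from automatic indexing (title or subtitle)
MINOR = 2           # query matches a minor MeSH term


@dataclass
class Hit:
    title: str
    match_tier: int
    published: date


def rank(hits: List[Hit]) -> List[Hit]:
    """Sort by tier, then by most recent publication date."""
    return sorted(hits,
                  key=lambda h: (h.match_tier, -h.published.toordinal()))


# Invented hits, for illustration only.
hits = [
    Hit("Minor match", MINOR, date(2011, 1, 1)),
    Hit("Older major", MAJOR_OR_TITLE, date(2009, 5, 1)),
    Hit("Automatic", AUTOMATIC, date(2011, 6, 1)),
    Hit("Newer major", MAJOR_OR_TITLE, date(2010, 5, 1)),
]
ordered_titles = [h.title for h in rank(hits)]
```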


Multi-Terminological Information Retrieval

§ RIMT using the same health terminologies, integrated to

the CISMeF backoffice

ü Operational in Doc’CISMeF since April 2009 (test)

ü Bi-terminological in the PSIP DIP since September 2008

§ Bag of words algorithm, stemming

§ Double context

ü Knowledge (CISMeF) + contextual knowledge

• PhD Saoussen Sakji, Dec. 2010 (Tunisia)

ü Electronic Health Record (EHR)

• PhD AD Dirieh-Dibad 4Y, planned March 2012 (Djibouti)


[Figure: panels labelled ATC and MeSH]


Results in the Doc’CISMeF search engine

§ Use of multi-terminology indexing with SNOMED &

MedDRA + MeSH indexing

Multi-terminology information retrieval

Multi-terminology manual

indexing using PTS


CISMeF & PTS

§ During 2009, in collaboration with 8 engineering students from INSA de

Rouen, and with LERTIM & MONDECA, the CISMeF team

developed a Multi-Terminology Health Portal (PTS as a French acronym).

§ Since 2007, design of a generic model to integrate the main

terminologies and ontologies available in French

§ The current health terminologies included in PTS are:

ü MeSH (+ MeSH SC + CISMeF extension), SNOMED Int, CCAM, ICD10 & ICPC2

(InterSTIS project)

ü ATC, ICF, WHO-ART, WHO-ICPS (WHO)

ü DRC, MedDRA, MEDLINEPlus

ü CIS, CIP, CAS, EC, INN, Brand Names, PSIP taxonomy, IUPAC, NCC-MERP (PSIP

Project)

ü Orphanet (rare diseases), LPP & Cladimed (medical devices)

ü TUV (Vidal), FMA, SNOMED CT, LOINC

ü ADICAP, NCIT (to be included)


HMTP generic model


Health Multi-Terminology Portal (HMTP; PTS)

§ URL: http://pts.chu-rouen.fr/

§ Access for humans and computers (Web services)

ü Since September 2010, used daily by the CISMeF team to index

Web resources manually and automatically

ü Since January 2011, MeSH is freely available (600 unique users

per working day)

§ Restricted access to the other terminologies (230 registered users)

§ Cooperation with BioPortal: Clement Jonquet & Mark

Musen


Main figures

              Terminologies   Concepts      Synonyms      Definitions   Relations & hierarchies

May 2010      25              > 580 000     > 840 000     > 220 000     > 1 200 000

August 2011   32              > 1 100 000   > 2 300 000   > 220 000     > 4 000 000


Future work

§ EHTOP

ü ICD10 in five European languages

ü URL: cispro.chu-rouen.fr/ehtop_site

ü Procedures & medical devices T/O

§ RIDoPI: Information retrieval on EHR

ü Numerical data

ü Temporal data

ü RAVEL (2012-4 ANR TecSan program)

§ Interface Terminologies

§ Multi-lingual search engine (already multi-T/O)

§ Teaching document: http://www.univ-rouen.fr/med/breeze/SDinserm/index.htm


Many thanks

§ Email: Stefan.Darmoni@chu-rouen.fr


[Figure: TIBS (Information processing in biology and in health) research map, Laboratoire d’Informatique, de Traitement de l’Information et des Systèmes (EA 4108); Scientific Council (Conseil Scientifique), 4 May 2007.

Periods: MONOTERMINOLOGY 1995-2005; MULTITERMINOLOGY 2005-.

Themes shown: CISMeF terminology (encapsulated MeSH, metaterms, resource types, strategy searches, metadata); implicit information retrieval (NLP, text mining, ontology); multi-terminology information retrieval; multi-terminology automatic indexing; textual automatic indexing (NLP, KNN, categorization); computer-assisted coding system; semantic interoperability intra and inter terminologies in health; cross-lingual multi-terminology portal (EHTOP, 32 T/O, InterSTIS); health information systems; French Infobutton; contextual knowledge; other medical terminologies & dictionaries (UMLF, VUMeF, VODeL, PIH).

People and partners named: P. Massari, L. Soualmia, M. Joubert, JF. Gehanno, Saoussen Sakji, B. Thirion, C. Letord, G. Kerdelhué, J. Piot, S. Pereira, B. Dahamna, IM. Kergourlay, T. Lecroq, Prof. SJ. Darmoni, A. Rogozan, A. Neveol, T. Merabti; LERTIM, INSA, Mondeca, CIFRE Vidal.]

Introduction (cont.)

§ Three priority axes:

ü evidence based medicine

ü teaching material

ü patient information

§ >81,000 resources included

§ 10,000 unique machines/working day

§ CISMeF team in 2011: N= 14

ü 1.5 medical informaticians

ü 1 chief medical librarian + 2.5 medical librarians

ü 1 computer scientist (one junior lecturer) + 1 medical resident

ü 3 research engineers

ü 2 Postdoc + 3 PhD students

§ Budget ≈ 500 K€/y; 40% RUH

§ 30 grants in the last ten years for CISMeF (TIBS, LITIS)
