<strong>EMBL</strong> Research at a Glance 2009<br />
Dietrich Rebholz-Schuhmann<br />
Master in Medicine, 1988,<br />
University of Düsseldorf.<br />
PhD 1989, University of<br />
Düsseldorf.<br />
Master in Computer Science,<br />
1993, Passau.<br />
Senior scientist at gsf,<br />
Munich, Germany and LION<br />
bioscience AG, Heidelberg.<br />
Facts from the literature and biomedical<br />
semantics<br />
Previous and current research<br />
Text mining comprises the fast retrieval of relevant documents from the whole body of the literature<br />
(e.g. Medline database) and the extraction of facts from the text thereafter. Text mining solutions<br />
are now becoming mature enough to be automatically integrated into workflows for<br />
research work.<br />
Research in the Rebholz-Schuhmann group is focussed on fact extraction from the literature. It is<br />
our goal to automatically connect literature content to other biomedical data resources (e.g. bioinformatics<br />
databases) and to evaluate the results. Ongoing research targets the identification of relationships<br />
between genes and diseases, molecular interactions and other types of information.<br />
Over the past two years, the team has generated several public resources: a lexicon of biomedical<br />
terms, an ontology for gene regulatory events and, most recently, an authoring service (PaperMaker).<br />
The work in the research group is split into different parts: 1) research work in named entity recognition<br />
and its quality control (e.g. UKPMC project, CALBC); 2) knowledge discovery tasks, e.g.<br />
for the identification of gene–disease associations; and 3) development of a modular IT infrastructure<br />
for information extraction (Whatizit). All parts are tightly coupled.<br />
Group leader at <strong>EMBL</strong>-EBI since 2003.<br />
Future projects and goals
The following goals are priorities for the future. Firstly, we will continue our ongoing research in<br />
term recognition and mapping to biomedical data resources to establish state-of-the-art text mining<br />
applications. We will pursue this by focussing on automatic means of measuring and evaluating<br />
existing options, in order to identify the most promising solutions (UKPMC project, CALBC support action).<br />
Secondly, we will invest further effort into the extraction of content from the scientific literature. Such solutions will be geared towards the<br />
annotation of diseases and the generation of fact databases. As part of this research we will investigate workflow systems where text mining<br />
supports bioinformatics information retrieval solutions. One such solution is the integration of public biomedical data resources with the data<br />
from the biomedical scientific literature.<br />
Finally, we will increase the availability of information extraction solutions based on SOAP web services for the benefit of the bioinformatics<br />
community. This requires standards in the annotation of scientific literature and will automatically lead to semantic enrichment of the scientific<br />
literature.<br />
Overview of the categorisation of information retrieval<br />
tools on the basis of their input and output formats.<br />
Selected references<br />
Beisswanger, E. et al. (2008). Gene Regulation Ontology (GRO):
Design principles and use cases. Paper presented at Studies in<br />
Health Technology and Informatics 2008, 136, 9-1<br />
Jaeger, S. et al. (2008). Integrating protein-protein interactions and<br />
text mining for protein function prediction. BMC Bioinformatics, 9,<br />
Article S2<br />
Kim, J.J. et al. (2008). MedEvi: Retrieving textual evidence of<br />
relations between biomedical concepts from Medline. Bioinformatics,<br />
2, 110-112<br />
Kim, J.J. & Rebholz-Schuhmann, D. (2008). Categorization of<br />
services for seeking information in biomedical literature: A typology<br />
for improvement of practice. Brief. Bioinform., 9, 52-65<br />
Rebholz-Schuhmann, D. et al. (2008). Text processing through web<br />
services: Calling Whatizit. Bioinformatics, 2, 296-298<br />