hu wissen (pdf) - Exzellenzinitiative - Humboldt-Universität zu Berlin
number of different spellings and synonyms in literature,« Leser explains. »This can be used to rank publications by their relevance or to find connections with other objects.«

Ulf Leser is Professor of Knowledge Management in Bioinformatics. He studied computer science at Technische Universität Munich and later worked at the Max Planck Institute for Molecular Genetics. He was awarded his PhD at Technische Universität Berlin in the »Distributed Information Systems« research training group. Since 2002 Leser has been a professor at Humboldt-Universität and a member of several HU institutions, including the Centre for Biophysics and Bioinformatics, the Centre for Linguistic Meaning and the Centre for Ubiquitous Information Systems.

leser@informatik.hu-berlin.de
Tel 030 · 2093-3902

Leser is involved in a number of biomedical research projects, including Colonet, a project funded by the Federal Ministry of Education and Research (BMBF) and hosted by Charité-Universitätsmedizin Berlin. One of Colonet’s aims is to find biomarkers that could make it possible to customize a therapy for colorectal cancer to the individual patient. The choice of therapy in this context is often based on knowledge of specific mutations in key genes, information that is »hidden« in a large number of publications. One thing that makes the electronic search difficult is the fact that gene names are often made up of five to ten individual words. Sometimes the same name applies to different objects; sometimes the same protein has several different names. It is also difficult to search across species boundaries, because scientific knowledge about tumours is often based on experiments with mice. Even biologists disagree about whether a distinction should be made between mice and men in gene searches.

How does text mining work? Computer scientists build systems that, for example, recognize words from sentence structures and »see through« grammatical structures using methods taken from language processing. This is followed by machine learning processes that derive general rules from known examples; these rules can then be applied to new texts. The HU researcher and his working group can now find all the human genes in PubMed’s 19 million abstracts in a single day.

»There are about 80 genes that play an important role in the Colonet project. Our job is not only to find them in publications; another challenge is to rank them in terms of their potential importance as biomarkers, so that we can distinguish between relevant and insignificant references,« Leser stresses. Another problem that the HU researcher and many other bioinformaticians are trying to solve is how to generate connections with other »objects« such as proteins during a search. In the human body there are about 2,000 genes that regulate other genes and are also relevant in the search for a biomarker. In order to automatically recognize which genes regulate other genes, it is necessary to recognize not only the genes themselves in sentences but also their interrelationships, a major issue in the global bio-text mining community. And for Ulf Leser.
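The problems described above (multi-word gene names, several spellings for the same gene, ranking, and relations between genes) can be illustrated with a small Python sketch. This is not the approach of Leser's group, which the article describes as combining language processing with machine learning; it is a plain dictionary lookup, shown only to make the tasks concrete. The tiny lexicon, the list of trigger words and all function names are assumptions made up for this illustration.

```python
import re
from collections import Counter

# Hypothetical toy lexicon: canonical gene symbol -> spellings and synonyms.
GENE_SYNONYMS = {
    "KRAS": ["KRAS", "K-ras", "Kirsten rat sarcoma viral oncogene homolog"],
    "TP53": ["TP53", "p53", "tumor protein p53"],
    "MDM2": ["MDM2", "HDM2"],
}

# Illustrative trigger words that may hint at a regulatory relation.
REGULATION_TRIGGERS = ("regulates", "activates", "represses", "inhibits")


def build_matcher(lexicon):
    """Compile one regex over all synonyms (longest first) plus a reverse lookup."""
    lookup = {syn.lower(): gene for gene, syns in lexicon.items() for syn in syns}
    alternatives = sorted(lookup, key=len, reverse=True)
    pattern = re.compile(
        r"(?<![\w-])(?:" + "|".join(re.escape(a) for a in alternatives) + r")(?![\w-])",
        re.IGNORECASE,
    )
    return pattern, lookup


def find_genes(text, pattern, lookup):
    """Return the canonical symbol of every gene mention, in order of appearance."""
    return [lookup[m.group(0).lower()] for m in pattern.finditer(text)]


def rank_genes(abstracts, pattern, lookup):
    """Rank genes by how many abstracts mention them (a crude relevance proxy)."""
    counts = Counter()
    for abstract in abstracts:
        counts.update(sorted(set(find_genes(abstract, pattern, lookup))))
    return counts.most_common()


def candidate_regulations(text, pattern, lookup):
    """Collect (left gene, right gene) pairs with a trigger word between them."""
    pairs = []
    for sentence in re.split(r"(?<=[.!?])\s+", text):
        mentions = list(pattern.finditer(sentence))
        for left, right in zip(mentions, mentions[1:]):
            between = sentence[left.end():right.start()].lower()
            if any(trigger in between for trigger in REGULATION_TRIGGERS):
                pairs.append(
                    (lookup[left.group(0).lower()], lookup[right.group(0).lower()])
                )
    return pairs


if __name__ == "__main__":
    pattern, lookup = build_matcher(GENE_SYNONYMS)
    # Invented example sentences, not quotations from PubMed.
    abstracts = [
        "Mutations in K-ras and p53 were reported.",
        "The tumor protein p53 activates MDM2 transcription. KRAS was also measured.",
    ]
    print(rank_genes(abstracts, pattern, lookup))
    print(candidate_regulations(abstracts[1], pattern, lookup))
```

Matching the longest synonym first keeps a multi-word name such as »tumor protein p53« from being reduced to its last token. A lookup of this kind cannot resolve names that refer to different objects or tell mouse genes from human ones, which is where the learned rules mentioned in the article come in.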
MOLECULAR ONCOLOGY