21.11.2014 Views

Layout 1 - EMBL Grenoble


EMBL Research at a Glance 2009

Christoph Steinbeck

PhD 1995, Rheinische Friedrich-Wilhelms-Universität, Bonn.

Postdoctoral research at Tufts University, Boston and the Max-Planck-Institute of Chemical Ecology, Jena, Germany, 1997-2002.

Habilitation, 2003, Organic Chemistry, Friedrich-Schiller-Universität, Jena, Germany.

Head of Research Group for Molecular Informatics, Cologne University Bioinformatics Center (CUBIC), 2002-2007.

Lecturer in Chemoinformatics, University of Tübingen, 2007.

Team leader at EMBL-EBI since 2008.

Chemoinformatics and metabolism

Previous and current research

The Chemoinformatics and Metabolism team aims to provide the biomedical community with information on small molecules and their interplay with biological systems. The group develops methods to decipher, organise and publish the small molecule metabolic content of organisms. We develop tools to quickly determine the structure of metabolites by stochastic screening of large candidate spaces and enable the identification of molecules with desired properties. This requires algorithms for the prediction of spectroscopic and other physicochemical properties of chemical graphs based on machine learning and other statistical methods.
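As a rough illustration of what stochastic screening of a large candidate space can look like, the following minimal sketch runs simulated annealing over a toy search space. Simulated annealing is used here only as one common stochastic search scheme; the text above does not say which method the team's tools use. The candidate encoding (a tuple of integers), `TARGET`, `toy_score` and `toy_neighbour` are hypothetical stand-ins for a real structure representation, a property predictor and a structure mutation step.

```python
import math
import random

# Hypothetical target "property vector"; a stand-in for measured data.
TARGET = (3, 1, 4, 1, 5)


def toy_score(candidate):
    """Lower is better: total deviation between candidate and target."""
    return sum(abs(c - t) for c, t in zip(candidate, TARGET))


def toy_neighbour(candidate, rng):
    """Return a copy of the candidate with one position changed by +/-1."""
    i = rng.randrange(len(candidate))
    step = rng.choice((-1, 1))
    value = min(9, max(0, candidate[i] + step))  # keep values in 0..9
    return candidate[:i] + (value,) + candidate[i + 1:]


def simulated_annealing(start, neighbour, score, steps=2000,
                        t_start=1.0, t_end=0.01, seed=0):
    """Stochastic search that minimises score(); returns (best, best_score)."""
    rng = random.Random(seed)
    current, current_score = start, score(start)
    best, best_score = current, current_score
    for i in range(steps):
        # Geometric cooling schedule from t_start down to t_end.
        t = t_start * (t_end / t_start) ** (i / max(steps - 1, 1))
        cand = neighbour(current, rng)
        cand_score = score(cand)
        delta = cand_score - current_score
        # Always accept improvements; accept a worse candidate with a
        # probability that shrinks as the temperature drops.
        if delta <= 0 or rng.random() < math.exp(-delta / t):
            current, current_score = cand, cand_score
            if current_score < best_score:
                best, best_score = current, current_score
    return best, best_score


if __name__ == "__main__":
    best, best_score = simulated_annealing((0, 0, 0, 0, 0),
                                           toy_neighbour, toy_score)
    print(best, best_score)
```

The search only samples the space instead of enumerating it, which is the point of a stochastic approach when the number of candidates is too large to visit exhaustively.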

We are further investigating the extraction of chemical knowledge from the printed literature by text and graph mining methods, improved dissemination of information in life science publications, as well as open chemoinformatics workflow systems. Together with an international group of collaborators, we develop the Chemistry Development Kit (CDK), the leading open source library for structural chemoinformatics, as well as the chemoinformatics subsystem of Bioclipse, an award-winning rich client for chemo- and bioinformatics.

Future projects and goals

The recently acquired resource of large-scale drug activity data at the EBI creates exciting new opportunities on both the research and the service side (www.ebi.ac.uk/Information/News/pdf/Press23July08.pdf). Our team has started to create an open source chemical search engine for the new resource, which will be the first open source chemistry search engine for the widely used Oracle™ Database system. A combination of the new chemogenomics data and the Chemistry Development Kit will allow us to create open structure-activity models and to assist efforts in wet lab screening in areas such as library design.

On the service side, ensuring sustainable growth for the ChEBI database will be the focus of our attention. The number of marketed and developed drugs in the World Drug Index alone currently amounts to more than 80,000 compounds. Even if organisms produce only a handful of metabolites from each of these drugs, the scale of the task ahead becomes clear. Not only does this task require a larger team for data collection and curation, but also research into the automated assembly and validation of ChEBI datasets to aid the human curators. Last but not least, 2009 will reveal the EBI's solution for integrating the chemogenomic data with existing chemical resources at the institute.

Computer-Assisted Structure Elucidation uses a structure generation engine to produce chemical spaces based on boundary conditions such as the gross formula of the unknown compound, determined for instance by mass spectrometry. These chemical spaces are then crawled and candidate structures in them inspected for fitness by comparing predicted and measured properties such as NMR spectra. Based on calculated fitness values, a ranking is presented to the user.
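The ranking step described above can be sketched in a few lines. This is a minimal illustration, not the team's implementation: the `Candidate` class, the `fitness` definition (negative mean absolute deviation between sorted shift lists) and all shift values are assumptions made for the example, and the numbers are invented toy values, not measured or predicted data.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """A candidate structure with hypothetical predicted 13C shifts (ppm)."""
    name: str
    predicted_shifts: list[float]


def fitness(predicted, measured):
    """Higher is better: negative mean absolute deviation in ppm.

    Shifts are paired after sorting, which is a simplification; a real
    system would need a proper peak assignment.
    """
    if len(predicted) != len(measured):
        return float("-inf")  # signal count does not match the spectrum
    pairs = zip(sorted(predicted), sorted(measured))
    return -sum(abs(p - m) for p, m in pairs) / len(measured)


def rank_candidates(candidates, measured):
    """Return (fitness, name) pairs, best candidate first."""
    scored = [(fitness(c.predicted_shifts, measured), c.name)
              for c in candidates]
    return sorted(scored, reverse=True)


if __name__ == "__main__":
    # Toy example: two candidates sharing the gross formula C2H6O.
    # All numbers below are illustrative values only.
    measured = [18.1, 57.8]
    candidates = [
        Candidate("dimethyl ether", [59.5, 59.5]),
        Candidate("ethanol", [18.0, 58.0]),
    ]
    for score, name in rank_candidates(candidates, measured):
        print(f"{name}: {score:.2f}")
```

With these toy values, ethanol ranks first because its predicted shifts lie much closer to the measured ones. In a real workflow the candidate list would come from a structure generator constrained by the gross formula, and the predicted shifts from a trained prediction model.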

Selected references

Kuhn, S. et al. (2008). Building blocks for automated elucidation of metabolites: Machine learning methods for NMR prediction. BMC Bioinformatics, 9, 400

Willighagen, E.L. et al. (2007). Userscripts for the life sciences. BMC Bioinformatics, 8, 487

Han, Y.Q. & Steinbeck, C. (2004). Evolutionary-algorithm-based strategy for computer-assisted structure elucidation. J. Chem. Inf. Com. Sci., 44, 489-498

Steinbeck, C. (2001). SENECA: A platform-independent, distributed, and parallel system for computer-assisted structure elucidation in organic chemistry. J. Chem. Inf. Com. Sci., 41, 1500-1507
