22.08.2016 Views

Annual Scientific Report 2015


Cheminformatics and Metabolism

Our team works on methods to decipher, organise and disseminate information about the metabolism of organisms. We develop and maintain MetaboLights, a metabolomics reference database and archive, and ChEBI, the database and ontology of chemical entities of biological interest.

We develop algorithms to process chemical information, predict metabolomes based on genomic and other information, determine the structure of metabolites by stochastic screening of large candidate spaces, and enable the identification of molecules with desired properties. This requires algorithms based on machine learning and other statistical methods for the prediction of spectroscopic and other physicochemical properties of compounds represented as chemical graphs.

Our research is dedicated to the elucidation of metabolomes, Computer-Assisted Structure Elucidation (CASE), the reconstruction of metabolic networks, biomedical and biochemical ontologies, and algorithm development in cheminformatics and bioinformatics. The chemical diversity of the metabolome and a lack of accepted reporting standards currently make analysis challenging and time-consuming. Part of our research comprises the development and implementation of methods to analyse spectroscopic data in metabolomics.

Major achievements

ChEBI

We added over 5 200 fully curated entries to the ChEBI database during 2015, bringing the total number of fully curated entries to over 47 500. Around 30% of new entries are from direct data submissions, and the remaining entries are from a variety of sources. Our manual curation concentrated on natural products, including compounds identified in submissions to the MetaboLights database, together with requests from individual users and research groups. We also put considerable curation effort into the Species table, introduced to store taxonomy and citation information for natural products, which now contains over 11 000 entries.

On the ChEBI public website, we enhanced main entity pages to allow users to view information about an entity in a wider biological context. All biological reactions and pathways in which the entity participates are displayed using Rhea and Reactome, respectively. ChEBI became entirely open source (again) in 2015 when we replaced the Marvin chemistry sketchpad (used for drawing chemical structures when performing exact structure, substructure, and similarity searches) with Ketcher.

From its first release in July 2004, ChEBI used SourceForge as an issue tracker for logging and tracking requests for new entries, updates to existing entries, new feature requests, bugs and other queries. During 2015, several press reports emerged regarding problems with SourceForge, which gave rise to worries about its future stability. In response to user requests, we migrated the ChEBI issue tracker to GitHub as it offers advanced features and flexibility.

MetaboLights

We developed new features to help researchers explore the reference compounds, studies and associated organism information in MetaboLights. These included species search functionality based on model organisms, free-text search and a browsable ‘tree of life’ with automatic classification compatible with any taxonomic identifier present in 89 different taxonomy sources.

We linked over 18 000 compounds in MetaboLights to ChEBI, enriching them with 5355 mass spectrometry spectra and 2890 NMR spectra. MetaboLights exported just over 150 datasets (265 total as of December 2015) to MetabolomeXchange, the global EBI Search and the Omics Discovery Index.

In 2015 we completely redesigned MetaboLights to accommodate a more flexible submission system, which now supports incremental submissions. The status of a study in our curation process is now clearly visible. We also designed a new online validation system that displays a detailed status against our current minimum reporting requirements. We incorporated new technology frameworks, for example AngularJS, and built new Web Services. We worked on integrating online analysis tools, for example adapting MetaboAnalyst (David Wishart group, University of Alberta). We incorporated tools into our architecture using R packages and the EMBL-EBI R-cloud. In tandem, we developed an authenticated Web Service that enables improved visualisations and response times, especially for larger datasets.

Community standards

The COSMOS consortium, an EU-funded endeavour for the coordination of standards for metabolomics, concluded its work in September 2015. The consortium, led by our team, combined the efforts

2015 EMBL-EBI Annual Scientific Report
