BeNeLux Bioinformatics Conference – Antwerp, December 7-8 2015<br />
Abstract ID: O19<br />
Oral presentation<br />
10th Benelux Bioinformatics Conference (BBC 2015)<br />
O19. A SYSTEMS BIOLOGY COMPENDIUM FOR LEISHMANIA DONOVANI<br />
Bart Cuypers 1,2,3* , Pieter Meysman 1,2 , Manu Vanaerschot 3 , Maya Berg 3 , Malgorzata Domagalska 3 , Jean-Claude<br />
Dujardin 3,4# & Kris Laukens 1,2# .<br />
Advanced Database Research and Modeling (ADReM), University of Antwerp 1 ; Biomedical informatics research center<br />
Antwerpen (biomina) 2 ; Molecular Parasitology Unit, Department of Biomedical Sciences, Institute of Tropical Medicine,<br />
Antwerp 3 ; Department of Biomedical Sciences, University of Antwerp 4 . * bart.cuypers@uantwerpen.be # shared senior<br />
authors<br />
Leishmania donovani is the cause of visceral leishmaniasis in the Indian subcontinent and poses a threat to public health<br />
due to increasing drug resistance. Little is known about its highly unusual molecular biology, and there has been little<br />
‘omics integration effort so far. Here we present an integrative database, or ‘omics compendium, that contains all<br />
genomics, transcriptomics, proteomics and metabolomics experiments that are currently publicly available for<br />
Leishmania donovani. Additionally, the user interface provides analysis tools for new datasets that use data mining<br />
strategies such as frequent itemset mining to link results from different ‘omics layers.<br />
INTRODUCTION<br />
The protozoan parasite Leishmania donovani causes<br />
visceral leishmaniasis (VL), a life-threatening disease<br />
that affects 500 000 people each year. With only four<br />
drugs available and rapidly emerging drug resistance,<br />
knowledge about the parasite’s resistance mechanisms is<br />
essential to boost the development of new drugs. However,<br />
little is known about gene regulation in<br />
Leishmania, and the few findings so far indicate major<br />
differences from known gene expression systems. Indeed, no<br />
RNA polymerase II promoters have ever been found in<br />
Leishmania 1 . Genes are constitutively transcribed in large<br />
polycistronic units and subsequently spliced into<br />
individual mRNAs (trans-splicing) 1 . A modified thymine,<br />
base J, marks the end of transcription units and functions<br />
as a stop signal for the RNA polymerase 2 . Gene<br />
expression is therefore assumed to be regulated at the<br />
post-transcriptional level (mRNA stability, translation<br />
efficiency, epigenetic factors, etc.), but evidence to<br />
support this is scarce 1 . Integration of different ‘omics<br />
layers could shed light on these gene regulatory mechanisms, but<br />
there has been little integration effort so far.<br />
METHODS<br />
We developed an easy-to-use tool that can import and<br />
connect all existing L. donovani ‘omics experiments.<br />
Genomics, epigenomics, transcriptomics, proteomics,<br />
metabolomics and phenotypic data were collected and<br />
added to a MySQL database compendium, further<br />
complemented with publicly available data. Relations<br />
between different ‘omics layers were explicitly defined<br />
and assigned a level of confidence. Python scripts<br />
were developed to preprocess, analyse and import the data.<br />
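As an illustration of how relations between ‘omics layers could carry a level of confidence, the following is a minimal sketch and not the compendium's actual schema. All table, column and entity names are hypothetical, and Python's built-in sqlite3 module stands in for MySQL so that the example runs without a database server.

```python
import sqlite3

# Hypothetical schema: names are illustrative assumptions, and sqlite3
# stands in for MySQL so the sketch runs without a server.
SCHEMA = """
CREATE TABLE entity (
    id    INTEGER PRIMARY KEY,
    layer TEXT NOT NULL,          -- e.g. 'gene', 'protein', 'metabolite'
    name  TEXT NOT NULL
);
CREATE TABLE relation (
    source_id  INTEGER NOT NULL REFERENCES entity(id),
    target_id  INTEGER NOT NULL REFERENCES entity(id),
    kind       TEXT NOT NULL,     -- e.g. 'encodes', 'converts'
    confidence REAL NOT NULL CHECK (confidence BETWEEN 0 AND 1)
);
"""


def build_demo_db():
    """Create an in-memory database with a few made-up entities and links."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO entity (id, layer, name) VALUES (?, ?, ?)",
        [
            (1, "gene", "geneA"),
            (2, "protein", "proteinA"),
            (3, "metabolite", "metaboliteX"),
        ],
    )
    conn.executemany(
        "INSERT INTO relation (source_id, target_id, kind, confidence) "
        "VALUES (?, ?, ?, ?)",
        [(1, 2, "encodes", 0.95), (2, 3, "converts", 0.40)],
    )
    return conn


def linked_entities(conn, name, min_confidence=0.5):
    """Return (name, layer, kind, confidence) for links leaving `name`
    whose confidence is at least `min_confidence`."""
    rows = conn.execute(
        """
        SELECT t.name, t.layer, r.kind, r.confidence
        FROM relation AS r
        JOIN entity AS s ON s.id = r.source_id
        JOIN entity AS t ON t.id = r.target_id
        WHERE s.name = ? AND r.confidence >= ?
        ORDER BY r.confidence DESC
        """,
        (name, min_confidence),
    )
    return rows.fetchall()
```

Storing the confidence on the relation itself, rather than on the entities, lets each query choose its own threshold for how speculative a cross-layer link may be.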
To allow comparability between different experiments,<br />
platforms and labs, the three integration principles of the<br />
COLOMBOS bacterial expression compendium were<br />
adapted 3 : 1) use the same data-analysis pipeline for all<br />
data; 2) work with contrasts to a control condition instead<br />
of expression values; 3) annotate these contrasts in a<br />
unified and structured manner.<br />
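The second and third principles can be sketched as follows. The exact contrast definition is not given here, so this example assumes a log2 ratio of test over control with a pseudocount. The function names, the annotation fields and the numbers are hypothetical.

```python
import math


def log2_contrast(test_values, control_values, pseudocount=1.0):
    """Return log2((test + p) / (control + p)) per feature.

    Both arguments are dicts mapping a feature ID to a non-negative
    measurement; only features present in both are kept.
    The log2 ratio and the pseudocount are assumptions of this sketch.
    """
    contrast = {}
    for feature in test_values.keys() & control_values.keys():
        ratio = (test_values[feature] + pseudocount) / (
            control_values[feature] + pseudocount
        )
        contrast[feature] = math.log2(ratio)
    return contrast


def annotated_contrast(test_values, control_values, annotation):
    """Bundle a contrast with a structured annotation (hypothetical fields)."""
    return {
        "annotation": dict(annotation),
        "values": log2_contrast(test_values, control_values),
    }


if __name__ == "__main__":
    # Made-up measurements for two features in a test and a control sample.
    record = annotated_contrast(
        {"geneA": 7.0, "geneB": 1.0},
        {"geneA": 1.0, "geneB": 3.0},
        {"factor": "drug", "test": "resistant", "control": "sensitive"},
    )
    print(record)
```

Working with such ratios rather than raw values means that platform-specific scaling largely cancels out within each experiment, which is what makes contrasts from different labs comparable.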
Alongside this data source, a set of integrative data-analysis<br />
tools was developed based on data mining<br />
strategies. For example, one tool uses frequent itemset<br />
mining algorithms to detect which proteins and<br />
metabolites frequently exhibit the same behaviour under<br />
different conditions. Another tool converts several ‘omics<br />
layers to a network format that can be opened in<br />
Cytoscape and can thus serve as the basis for network analysis.<br />
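To make the frequent itemset idea concrete, here is a small Apriori-style sketch. It is not the tool's own implementation, and the choice of algorithm is an assumption. Each condition is treated as a transaction whose items are hypothetical labels such as "proteinA:up", and the function returns every combination of items that occurs in at least `min_support` conditions.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Apriori-style search.

    Returns {itemset (frozenset): count} for every itemset contained in
    at least `min_support` of the given transactions.
    """
    transactions = [frozenset(t) for t in transactions]
    # Level 1 candidates: every single item that occurs anywhere.
    candidates = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    size = 1
    while candidates:
        # Count in how many transactions each candidate is contained.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        level = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(level)
        size += 1
        # Join frequent itemsets into candidates that are one item larger.
        joined = {a | b for a, b in combinations(level, 2) if len(a | b) == size}
        # Apriori pruning: every (size - 1)-subset must itself be frequent.
        candidates = {
            c
            for c in joined
            if all(frozenset(s) in level for s in combinations(c, size - 1))
        }
    return frequent


if __name__ == "__main__":
    # Made-up behaviour of proteins and metabolites under four conditions.
    conditions = [
        {"proteinA:up", "metaboliteX:up", "proteinB:down"},
        {"proteinA:up", "metaboliteX:up"},
        {"proteinA:up", "metaboliteX:up", "proteinB:up"},
        {"proteinB:down"},
    ]
    for itemset, count in frequent_itemsets(conditions, min_support=3).items():
        print(sorted(itemset), count)
```

Encoding the direction of change in the item label is one simple way to let a single itemset span several ‘omics layers; a production tool would more likely rely on an established mining library.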
The Django and Twitter Bootstrap frameworks were used<br />
to create a web portal to make the tools accessible to any<br />
Leishmania researcher.<br />
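The network format produced by the export tool is not specified above. As one possibility, Cytoscape can import the Simple Interaction Format (SIF), in which each line holds a source node, an interaction type and a target node. The sketch below assumes SIF and uses hypothetical node names.

```python
def to_sif(edges):
    """Serialise (source, interaction, target) triples as tab-delimited SIF.

    SIF is assumed here as the target format; tabs and newlines inside a
    field would corrupt the file, so they are rejected.
    """
    lines = []
    for source, interaction, target in edges:
        for field in (source, interaction, target):
            if "\t" in field or "\n" in field:
                raise ValueError(f"field contains tab or newline: {field!r}")
        lines.append(f"{source}\t{interaction}\t{target}\n")
    return "".join(lines)


if __name__ == "__main__":
    # Made-up links between a gene, its protein and a metabolite.
    demo_edges = [
        ("geneA", "encodes", "proteinA"),
        ("proteinA", "converts", "metaboliteX"),
    ]
    with open("compendium_demo.sif", "w", encoding="utf-8") as handle:
        handle.write(to_sif(demo_edges))
```

The resulting file can be loaded through Cytoscape's network import; edge attributes such as a confidence score would need a separate table, because SIF itself carries only the interaction type.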
RESULTS & DISCUSSION<br />
Excellent public gene, protein and metabolite annotation<br />
databases for Leishmania and related species are already<br />
available (e.g. TriTrypDB and GeneDB). However, the<br />
strength of our tool is that it links these annotation data to<br />
‘omics experiments that are either provided by the user or<br />
publicly available. New experiments can quickly<br />
be preprocessed, analysed and integrated into the database<br />
via its Python back end. The compendium is therefore not<br />
only a look-up tool (e.g. under which conditions is this<br />
gene or metabolite upregulated?), but also offers data<br />
mining tools to analyse user-provided data<br />
(e.g. which metabolites/genes are typically<br />
upregulated in drug-resistant strains?). These new<br />
experiments provide additional confidence and<br />
information about the biological entities in the database.<br />
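The look-up question "under which conditions is this gene or metabolite upregulated?" can be sketched as a simple filter over stored contrasts. The record layout, the condition names, the values and the log2 fold-change threshold of 1.0 are all assumptions made for this example.

```python
def conditions_where_upregulated(contrasts, entity, min_log2_fc=1.0):
    """Return (condition, log2 fold change) pairs, strongest first, for the
    contrasts in which `entity` reaches at least `min_log2_fc`.

    `contrasts` is a list of dicts with a "condition" label and a "values"
    dict mapping entity names to log2 fold changes (hypothetical layout).
    """
    hits = [
        (c["condition"], c["values"][entity])
        for c in contrasts
        if entity in c["values"] and c["values"][entity] >= min_log2_fc
    ]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


if __name__ == "__main__":
    # Made-up contrasts, each relative to its own control condition.
    demo_contrasts = [
        {"condition": "drug-resistant vs sensitive", "values": {"geneA": 2.5}},
        {"condition": "stage 2 vs stage 1", "values": {"geneA": 0.2}},
        {"condition": "stress vs untreated", "values": {"geneA": 1.0, "geneB": -3.0}},
    ]
    print(conditions_where_upregulated(demo_contrasts, "geneA"))
```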
Unlike many other databases, the compendium has an<br />
elaborate quality control system. Every result provided by<br />
the tools can be traced back to the experimental data,<br />
which contains the necessary quality control plots to<br />
support the experiment’s validity. Additionally, it contains<br />
all relevant information about the extractions and the<br />
origin of the biological material.<br />
Using the compendium and its tools, we characterized the<br />
development and drug resistance of Leishmania donovani<br />
in a systems biology context. The genomes of more<br />
than 200 strains were examined for associations with<br />
phenotypic features, and a subset was linked to<br />
transcriptomics, proteomics and metabolomics results. The<br />
compendium and its scripts were designed to be generic<br />
and can therefore be used for other organisms with only<br />
minor changes.<br />
REFERENCES<br />
1. Donelson, J. (1999) PNAS. 96, 2579–258.<br />
2. Van Luenen, H. G. A. M. et al. (2012) Cell. 150, 909–21.<br />
3. Meysman, P. et al. (2014) Nucleic Acids Research. 42, D649–D653.<br />