
BeNeLux Bioinformatics Conference – Antwerp, December 7-8 2015

Abstract ID: O19

Oral presentation

10th Benelux Bioinformatics Conference bbc 2015

O19. A SYSTEMS BIOLOGY COMPENDIUM FOR LEISHMANIA DONOVANI

Bart Cuypers (1,2,3,*), Pieter Meysman (1,2), Manu Vanaerschot (3), Maya Berg (3), Malgorzata Domagalska (3), Jean-Claude Dujardin (3,4,#) & Kris Laukens (1,2,#).

(1) Advanced Database Research and Modeling (ADReM), University of Antwerp; (2) Biomedical informatics research center Antwerpen (biomina); (3) Molecular Parasitology Unit, Department of Biomedical Sciences, Institute of Tropical Medicine, Antwerp; (4) Department of Biomedical Sciences, University of Antwerp.

(*) bart.cuypers@uantwerpen.be; (#) shared senior authors

Leishmania donovani is the cause of visceral leishmaniasis in the Indian subcontinent and poses a threat to public health due to increasing drug resistance. Little is known about its highly unusual molecular biology, and there has been little ‘omics integration effort so far. Here we present an integrative database, or ‘omics compendium, that contains all genomics, transcriptomics, proteomics and metabolomics experiments that are currently publicly available for Leishmania donovani. Additionally, the user interface offers analysis tools for new datasets that use data mining strategies such as frequent itemset mining to link results from different ‘omics layers.

INTRODUCTION

The protozoan parasite Leishmania donovani causes visceral leishmaniasis (VL), a life-threatening disease that affects 500 000 people each year. With only four drugs available and drug resistance emerging rapidly, knowledge of the parasite’s resistance mechanisms is essential to boost the development of new drugs. However, little is known about gene regulation in Leishmania, and the few findings indicate major differences from known gene expression systems. Indeed, no RNA polymerase II promoters have ever been found in Leishmania [1]. Genes are constitutively transcribed in large polycistronic units and subsequently spliced into individual mRNAs (trans-splicing) [1]. A modified thymine, Base J, marks the end of transcription units and functions as a stop signal for the RNA polymerase [2]. Gene expression is therefore assumed to be regulated at the post-transcriptional level (mRNA stability, translation efficiency, epigenetic factors, etc.), but evidence to support this is scarce [1]. Integration of different ‘omics layers could shed light on these gene regulatory mechanisms, but there has been little integration effort so far.

METHODS

We developed an easy-to-use tool that can import and connect all existing L. donovani ‘omics experiments. Genomics, epigenomics, transcriptomics, proteomics, metabolomics and phenotypic data were collected and added to a MySQL database compendium, further complemented with publicly available data. Relations between different ‘omics layers were explicitly defined and assigned a level of confidence. Python scripts were developed to preprocess, analyse and import the data.
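As an illustration of how relations between ‘omics layers could be stored together with a confidence level, the following is a minimal sketch only. The abstract does not describe the actual schema: the table names, column names, relation kinds and example entities below are all hypothetical, and Python's built-in sqlite3 module stands in for MySQL so that the example is self-contained.

```python
import sqlite3

# Hypothetical schema: table and column names are illustrative only and are
# not taken from the compendium. SQLite is used here as a stand-in for MySQL.
SCHEMA = """
CREATE TABLE entity (
    id    INTEGER PRIMARY KEY,
    layer TEXT NOT NULL,       -- e.g. 'gene', 'protein', 'metabolite'
    name  TEXT NOT NULL
);
CREATE TABLE relation (
    source_id  INTEGER NOT NULL REFERENCES entity(id),
    target_id  INTEGER NOT NULL REFERENCES entity(id),
    kind       TEXT NOT NULL,  -- e.g. 'encodes', 'catalyses'
    confidence REAL NOT NULL CHECK (confidence BETWEEN 0 AND 1)
);
"""


def build_demo_db():
    """Create an in-memory database with a few made-up entities and relations."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO entity VALUES (?, ?, ?)",
        [
            (1, "gene", "gene_A"),
            (2, "protein", "protein_A"),
            (3, "metabolite", "metabolite_X"),
        ],
    )
    conn.executemany(
        "INSERT INTO relation VALUES (?, ?, ?, ?)",
        [
            (1, 2, "encodes", 0.95),
            (2, 3, "catalyses", 0.60),
        ],
    )
    conn.commit()
    return conn


def linked_entities(conn, name, min_confidence=0.0):
    """Return (layer, name, kind, confidence) for entities linked from `name`."""
    rows = conn.execute(
        """
        SELECT t.layer, t.name, r.kind, r.confidence
        FROM relation r
        JOIN entity s ON s.id = r.source_id
        JOIN entity t ON t.id = r.target_id
        WHERE s.name = ? AND r.confidence >= ?
        ORDER BY r.confidence DESC
        """,
        (name, min_confidence),
    )
    return rows.fetchall()
```

Storing the confidence on the relation itself lets a query such as `linked_entities(conn, "protein_A", min_confidence=0.8)` leave out weakly supported links without touching the underlying data.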

To allow comparability between different experiments, platforms and labs, the three integration principles of the COLOMBOS bacterial expression compendium were adapted [3]: 1) use the same data-analysis pipeline for all data; 2) work with contrasts to a control condition instead of expression values; 3) annotate these contrasts in a unified and structured manner.
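The second and third principles can be sketched as follows. This is an assumption-laden illustration, not the compendium's pipeline: the abstract does not say how contrasts are computed, so a log2 ratio of mean measurements with a pseudocount is assumed here, and the record fields (`entity`, `test`, `control`, `log2_ratio`) are hypothetical.

```python
import math
from statistics import mean


def log2_contrast(test_values, control_values, pseudocount=1.0):
    """Log2 ratio of the mean test measurement to the mean control measurement.

    The pseudocount (an assumption of this sketch) avoids division by zero
    and damps ratios between very small values.
    """
    test_mean = mean(test_values)
    control_mean = mean(control_values)
    return math.log2((test_mean + pseudocount) / (control_mean + pseudocount))


def annotate_contrast(entity, test_condition, control_condition,
                      test_values, control_values):
    """Bundle a contrast with a structured description of what was compared."""
    return {
        "entity": entity,
        "test": test_condition,
        "control": control_condition,
        "log2_ratio": log2_contrast(test_values, control_values),
    }
```

Because every record stores a ratio against its own control rather than a raw expression value, records from different platforms or labs can be compared on the same scale, which is the point of the second principle.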

Alongside this extensive data source, a set of integrative data-analysis tools was developed based on data mining strategies. For example, one tool uses frequent itemset mining algorithms to detect which proteins and metabolites frequently exhibit the same behaviour under different conditions. Another tool converts several ‘omics layers to a network format that can be opened in Cytoscape and can thus serve as the basis for network analysis.
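To make the frequent itemset mining idea concrete, here is a small sketch using the classic Apriori approach. The abstract does not state which algorithm or encoding the tool uses, so both are assumptions: each condition is treated as a "transaction" whose items are hypothetical labels such as `protein_A:up`, and an itemset is called frequent when it occurs in at least `min_support` conditions.

```python
from itertools import combinations

# Made-up example data: one set of observed behaviours per condition.
demo_conditions = [
    {"protein_A:up", "metabolite_X:up", "protein_B:down"},
    {"protein_A:up", "metabolite_X:up"},
    {"protein_A:up", "metabolite_X:up", "protein_B:up"},
    {"protein_B:down"},
]


def frequent_itemsets(transactions, min_support):
    """Return {itemset: count} for every itemset found in >= min_support transactions."""
    transactions = [frozenset(t) for t in transactions]

    # Level 1: count single items.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: n for s, n in counts.items() if n >= min_support}
    result = dict(current)

    k = 2
    while current:
        # Candidates of size k are unions of two frequent (k-1)-itemsets.
        candidates = {a | b for a, b in combinations(list(current), 2)
                      if len(a | b) == k}
        # Apriori pruning: every (k-1)-subset of a candidate must be frequent.
        candidates = {c for c in candidates
                      if all(frozenset(sub) in current
                             for sub in combinations(c, k - 1))}
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        current = {s: n for s, n in counts.items() if n >= min_support}
        result.update(current)
        k += 1
    return result
```

With the made-up data above and `min_support=3`, the pair `protein_A:up` and `metabolite_X:up` is reported as frequent because both labels occur together in three of the four conditions.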

The Django and Twitter Bootstrap frameworks were used to create a web portal to make the tools accessible to any Leishmania researcher.
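The network conversion mentioned in this section could look roughly like the sketch below. The abstract does not name the file format, so Cytoscape's Simple Interaction Format (SIF), in which each line holds a source node, an interaction type and a target node, is assumed here; the function name and the example relations are hypothetical.

```python
def to_sif(relations):
    """Serialise (source, interaction, target) triples as tab-delimited SIF text.

    Tab delimiters are used so that node names may contain spaces.
    """
    return "".join(
        f"{source}\t{interaction}\t{target}\n"
        for source, interaction, target in relations
    )


def write_sif(relations, path):
    """Write the SIF text to `path` so it can be imported into Cytoscape."""
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(to_sif(relations))
```

For example, `to_sif([("gene_A", "encodes", "protein_A")])` yields a single line linking the two nodes. SIF carries no edge attributes, so values such as confidence levels would have to be supplied to Cytoscape in a separate attribute table.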

RESULTS & DISCUSSION

Excellent public gene, protein and metabolite annotation databases for Leishmania and related species are already available (e.g. TriTrypDB and GeneDB). The strength of our tool, however, is that it links these annotation data to ‘omics experiments that are either provided by the user or publicly available. New experiments can quickly be preprocessed, analysed and integrated into the database via its Python back end. The compendium is therefore not only a look-up tool (e.g. under which conditions is this gene or metabolite upregulated?) but also offers data mining tools to analyse user-provided data (e.g. which metabolites/genes are typically upregulated in drug-resistant strains?). These new experiments provide additional confidence and information about the biological entities in the database.
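The two example questions above can be expressed as simple queries over contrast records. The sketch below is illustrative only: the record layout, the `drug-resistant` tag, the log2 threshold of 1.0 and the 75% cut-off for "typically" are all assumptions, and the data are made up.

```python
from collections import defaultdict

# Made-up contrast records; field names and values are hypothetical.
demo_contrasts = [
    {"entity": "gene_A", "test": "resistant_1", "tags": {"drug-resistant"}, "log2_ratio": 2.1},
    {"entity": "gene_A", "test": "resistant_2", "tags": {"drug-resistant"}, "log2_ratio": 1.4},
    {"entity": "gene_A", "test": "stationary", "tags": set(), "log2_ratio": -0.3},
    {"entity": "metabolite_X", "test": "resistant_1", "tags": {"drug-resistant"}, "log2_ratio": 1.8},
    {"entity": "metabolite_X", "test": "resistant_2", "tags": {"drug-resistant"}, "log2_ratio": 0.2},
]


def conditions_where_upregulated(contrasts, entity, threshold=1.0):
    """Look-up: under which conditions is `entity` upregulated?"""
    return sorted(
        c["test"] for c in contrasts
        if c["entity"] == entity and c["log2_ratio"] >= threshold
    )


def typically_upregulated(contrasts, tag, threshold=1.0, min_fraction=0.75):
    """Mining: which entities are upregulated in most contrasts carrying `tag`?"""
    seen = defaultdict(int)  # contrasts with the tag, per entity
    up = defaultdict(int)    # of those, how many are upregulated
    for c in contrasts:
        if tag in c["tags"]:
            seen[c["entity"]] += 1
            if c["log2_ratio"] >= threshold:
                up[c["entity"]] += 1
    return sorted(e for e in seen if up[e] / seen[e] >= min_fraction)
```

The first function is a plain look-up for one entity, whereas the second aggregates over all contrasts that share an annotation, which is only possible because the contrasts are annotated in a uniform way.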

Unlike many other databases, the compendium has an elaborate quality control system. Every result provided by the tools can be traced back to the experimental data, which include the quality control plots needed to support the experiment’s validity. Additionally, the compendium contains all relevant information about the extractions and the origin of the biological material.

Using the compendium and its tools, we characterized the development and drug resistance of Leishmania donovani in a systems biology context. The genomes of more than 200 strains were examined for associations with phenotypic features, and a subset was linked to transcriptomics, proteomics and metabolomics results. The compendium and its scripts were designed to be generic and can therefore be used for other organisms with only minor changes.

REFERENCES

1. Donelson, J. (1999) PNAS. 96, 2579–258.

2. Van Luenen, H. G. A. M. et al. (2012) Cell. 150, 909–21.

3. Meysman, P. et al. (2014) Nucleic Acids Research. 42, D649–D653.
