bbc 2015

BeNeLux Bioinformatics Conference – Antwerp, December 7-8 2015
Abstract ID: P Poster
10th Benelux Bioinformatics Conference bbc 2015

P14. NOVOPLASTY: IN SILICO ASSEMBLY OF PLASTID GENOMES FROM WHOLE GENOME NGS DATA

Nicolas Dierckxsens 1,2*, Olivier Hardy 2, Ludwig Triest 3, Patrick Mardulyn 2 & Guillaume Smits 1,4.

1 Interuniversity Institute of Bioinformatics Brussels (IB2), ULB-VUB, Triomflaan CP 263, 1050 Brussels, Belgium
2 Evolutionary Biology and Ecology Unit, CP 160/12, Faculté des Sciences, Université Libre de Bruxelles, Av. F. D. Roosevelt 50, B-1050 Brussels, Belgium
3 Plant Biology and Nature Management, Vrije Universiteit Brussel, Brussels, Belgium
4 Department of Paediatrics, Hôpital Universitaire des Enfants Reine Fabiola (HUDERF), Université Libre de Bruxelles (ULB), Brussels, Belgium
* nicolasdierckxsens@hotmail.com

Thanks to advances in next-generation sequencing (NGS) technology, whole genome data can be readily obtained from a variety of samples. Many algorithms are available to assemble these reads, but few of them focus on assembling plastid genomes. We therefore developed a new algorithm that assembles only the plastid genome from whole genome data, starting from a single seed. The algorithm takes full advantage of very high coverage, which allows it to assemble through problematic regions such as AT-rich stretches. It has been tested on several whole genome Illumina datasets, where it outperformed other assemblers in runtime and specificity. Every assembly of a chloroplast or mitochondrial genome produced a single contig, always within 30 minutes.

INTRODUCTION

Chloroplasts and mitochondria are both responsible for generating metabolic energy within eukaryotic cells.
Both organelles are maternally inherited and have a persistent gene organization, which makes them ideal for phylogenetic studies or as a barcode in plant and food identification (Brozynska et al., 2014). However, assembling these genomes is not always straightforward with the currently available tools. We therefore developed a new algorithm specifically for the assembly of plastid genomes from whole genome data.

METHODS

The algorithm is written in Perl. All assemblies were executed on an Intel Xeon machine with 24 cores at 2.93 GHz and a total of 96.8 GB of RAM. All nonhuman samples were sequenced on the Illumina HiSeq platform (101 bp paired-end reads). The human mitochondrial samples (PCR-free) were sequenced on the Illumina HiSeqX platform (150 bp paired-end reads). The Gonioctena intermedia sample was also sequenced on the PacBio platform.

RESULTS & DISCUSSION

Algorithm. The algorithm is similar to string overlap algorithms such as SSAKE (Warren et al., 2007) and VCAKE (Jeck et al., 2007). It starts by reading the sequences into a hash table, which allows quick lookup. The assembly is initiated by a seed that is extended bidirectionally in iterations. The seed input is quite flexible: it can be a single sequence read, a conserved gene or even a complete mitochondrial genome from a distant species. Every base extension is determined by a consensus between the overlapping reads. Unlike most assemblers, NOVOPlasty does not try to assemble every read, but extends the given seed until the circular plastid genome is formed.

Assemblies. NOVOPlasty has so far been tested on the assembly of 8 chloroplasts and 6 mitochondria. Since chloroplasts contain an inverted repeat, two versions of the assembly are generated. They differ only in the orientation of the region between the two repeats; the correct one has to be resolved manually.
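
The seed-extension scheme described under Algorithm can be illustrated with a short Python sketch. This is not NOVOPlasty's Perl code: it is a minimal illustration, assuming error-free single-end reads, a seed whose last k bases occur exactly in the reads, and a simple majority vote as the consensus rule. The names build_index, extend_right and assemble and the parameters k, min_fraction and max_len are hypothetical and chosen only for this example.

```python
from collections import Counter, defaultdict

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of a DNA string over A, C, G, T."""
    return seq.translate(_COMPLEMENT)[::-1]


def build_index(reads, k):
    """Hash table: each k-mer -> counts of the bases that follow it in the reads.

    Both strands of every read are indexed so that the contig can be
    extended in either direction.
    """
    index = defaultdict(Counter)
    for read in reads:
        for strand in (read, revcomp(read)):
            for i in range(len(strand) - k):
                index[strand[i:i + k]][strand[i + k]] += 1
    return index


def extend_right(contig, index, k, min_fraction=0.6, max_len=1_000_000):
    """Append consensus bases until the contig closes on itself or stalls.

    Returns (contig, circular).
    """
    while len(contig) < max_len:
        votes = index.get(contig[-k:])
        if not votes:
            return contig, False      # no overlapping read: dead end
        base, count = votes.most_common(1)[0]
        if count / sum(votes.values()) < min_fraction:
            return contig, False      # no clear consensus: stop (assumed rule)
        contig += base
        if len(contig) >= 2 * k and contig[-k:] == contig[:k]:
            return contig[:-k], True  # end overlaps start: circle closed
    return contig, False


def assemble(reads, seed, k=12):
    """Extend a seed in both directions until a circular contig is formed."""
    if len(seed) < k:
        raise ValueError("seed must be at least k bases long")
    index = build_index(reads, k)
    contig, circular = extend_right(seed, index, k)
    if not circular:
        # Extend the other end by working on the reverse complement.
        contig, circular = extend_right(revcomp(contig), index, k)
        contig = revcomp(contig)
    return contig, circular
```

Calling assemble(reads, seed) returns the contig together with a flag that reports whether the circle was closed. Only reads that overlap the growing contig are ever consulted, which reflects the idea of extending one seed rather than assembling every read. A real implementation would also need to handle sequencing errors, paired-end information and repeats such as the chloroplast inverted repeat.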
Except for the mitochondrion of the leaf beetle Gonioctena intermedia, all assemblies resulted in a complete circular genome. A comparison of four assemblers on the mitochondrial genome of G. intermedia clearly shows the speed and specificity of NOVOPlasty (Table 1).

                          NOVOPlasty   MIRA   MITObim   ARC
Duration (min)                    12    536     4777*   586
Memory (GB)                       15   57.6      63.4   1.9
Storage (GB)                       0    144       418    12
Total contigs                      1   3434      2221  2502
Mitochondrial contigs              1      1         4    48
Coverage (%)                      98     94        94    84
Mismatches                        10     25        26     2
Unidentified nucleotides          43    194       197     0

TABLE 1. Benchmark of four assemblies of the mitochondrial genome of Gonioctena intermedia. The assemblies were constructed with MITObim (Hahn et al., 2013), MIRA (Chevreux et al., 1999), ARC (Hunter et al., 2015) and NOVOPlasty. * manually terminated

Discussion. Despite the many available assemblers, many researchers still struggle to find a good assembler for plastid genomes. NOVOPlasty offers an assembler specifically designed for plastids that delivers the complete genome within 30 minutes. The algorithm will be tested on more datasets, and a comparative study with other assemblers is in progress.

REFERENCES

Brozynska et al. PLoS One 9 (2014).
Chevreux et al. Computer Science and Biology: Proceedings of the German Conference on Bioinformatics (GCB) (1999).
Hahn et al. Nucleic Acids Research, 1-9 (2013).
Hunter et al. http://dx.doi.org/10.1101/014662 (2015).
Jeck et al. Bioinformatics 23, 2942-2944 (2007).
Warren et al. Bioinformatics 23, 500-501 (2007).

P15. ENANOMAPPER - ONTOLOGY, DATABASE AND TOOLS FOR NANOMATERIAL SAFETY EVALUATION

Friederike Ehrhart 1, Linda Rieswijk 1, Chris T. Evelo 1, Haralambos Sarimveis 2, Philip Doganis 2, Georgios Drakakis 2, Bengt Fadeel 3, Barry Hardy 4, Janna Hastings 5, Christoph Helma 6, Nina Jeliazkova 7, Vedrin Jeliazkov 7, Pekka Kohonen 8,9, Roland Grafström 9, Pantelis Sopasakis 10, Georgia Tsiliki 2 & Egon Willighagen 1.

1 Department of Bioinformatics - BiGCaT, Maastricht University
2 National Technical University of Athens
3 Karolinska Institutet
4 Douglas Connect
5 European Molecular Biology Laboratory – European Bioinformatics Institute
6 In silico toxicology
7 Ideaconsult Ltd.
8 VTT Technical Research Centre of Finland
9 Misvik Biology
10 IMT Institute for Advanced Studies
* friederike.ehrhart@maastrichtuniversity.nl

eNanoMapper is an open computational infrastructure for engineered nanomaterial data: it comprises a semantic web supported database, an ontology, user applications for upload and download of experimental data, and tools for modelling.

INTRODUCTION

Nanomaterials are defined by size: between 1 nm and 100 nm in at least one dimension. The properties of these materials do not always resemble those of the bulk material, i.e. micro-sized and larger particles, or solutions. Nanomaterials can differ in reactivity and in toxicity to organisms and ecosystems, depending on their size, their surface properties and the possibility of “leakage” of the material they are made of. That is why it is so difficult to assess the safety of nanomaterials, and why the NanoSafety Cluster defined a need for a new computational infrastructure in 2012. eNanoMapper is a European project with partners from eight European countries.
This project has been developing a computational infrastructure consisting of a semantic web assisted database, a modular ontology, and tools to use them for nanomaterial safety assessment. Data sharing, data storage, data analysis tools, and web services are currently being developed, tested and put into production use. The project website can be found at www.enanomapper.net.

PROBLEM

The eNanoMapper platform is designed to host data on nanomaterial properties relevant for nanosafety assessment, as found in existing databases such as the NanoMaterial Registry, DaNa Knowledge Base, Nanoparticle Information Library (NIL), Nanomaterial-Biological Interactions Knowledgebase, caNanoLab, InterNano, Nano-EHS Database Analysis Tool, nanoHUB, etc. Each of them uses different data formats and descriptors, such as CODATA-VAMAS’ Universal Description System, ISA-Tab(-Nano), OECD templates, custom spreadsheets, and images. Interoperability is a main aim: semi-automatic import or upload of information, and its integration into the eNanoMapper data structure, is being enabled. Conversely, retrieval or download of experimental data from the database for (re-)analysis should be provided too, through programmatic interfaces to the data and the ontology. Database and search functionality should be semantic web compatible: the project develops and maintains a nanosafety ontology to support this. This eNanoMapper ontology was developed using the Web Ontology Language, and the challenge is to map nanomaterial terms to their multiple ontology terms, namely physico-chemical properties, biological and ecological impact, experimental assay description, and known safety aspects.

RESULTS & DISCUSSION

The current eNanoMapper demo database instance, available at https://data.enanomapper.net/, contains physico-chemical, biological and environmental properties of 465 different nanomaterials [1].
Data can be loaded into the database in various formats, including the OECD Harmonized Templates and the data structure used by the NanoWiki [2]. A web interface is designed to support the interactions users may want to perform with the database, including uploading experimental data and querying data to support analysis and modelling of nanoparticle properties.

The eNanoMapper ontology is available at http://purl.enanomapper.net/onto/enanomapper.owl and is based on a multi-faceted description of nanoparticles covering nanoparticle types, physico-chemical description, life cycle, biological and environmental characterisation including experimental methods and protocols, and safety information [3]. The terms are verified against the definitions of REACH, ISO, or common scientific practice. The definitions distinguish between endpoints and assays, which are often confused, e.g. size versus size measurement assay. Existing ontologies such as NPO, ChEBI and GO could partly be used as a basis, but many terms had to be added manually. Currently, 4592 classes are defined. Users can access and download the ontology from the U.S. National Center for Biomedical Ontology (NCBO) BioPortal platform, http://bioportal.bioontology.org/ontologies/ENM.

REFERENCES

1 Jeliazkova, N. et al. The eNanoMapper database for nanomaterial safety information. Beilstein Journal of Nanotechnology 6, 1609-1634, doi:10.3762/bjnano.6.165 (2015).
2 Willighagen, E. doi:10.6084/m9.figshare.1330208
3 Hastings, J. et al. eNanoMapper: harnessing ontologies to enable data integration for nanomaterial risk assessment. J Biomed Semantics 6, 10, doi:10.1186/s13326-015-0005-5 (2015).