bbc 2015

BeNeLux Bioinformatics Conference – Antwerp, December 7-8 2015
Abstract ID: P Poster
10th Benelux Bioinformatics Conference bbc 2015

P6. ENTEROCOCCUS FAECIUM GENOME DYNAMICS DURING LONG-TERM PATIENT GUT COLONIZATION

Jumamurat R. Bayjanov 1*, Jery Baan 1, Mark de Been 1, Mick Watson 2 & Willem van Schaik 1.
Department of Medical Microbiology, University Medical Center Utrecht, Utrecht, The Netherlands 1; Edinburgh Genomics, The University of Edinburgh, Edinburgh, Scotland 2.
* J.Bayjanov@umcutrecht.nl

Enterococcus faecium, a recently evolved multi-drug resistant nosocomial pathogen, is able to rapidly colonize the human gut. Previous work on animal, healthy human and clinical E. faecium strains has shown that clinical isolates form a distinct lineage. However, these studies lack a detailed niche-specific and longitudinal analysis of the evolutionary dynamics of this organism. Here we present a longitudinal analysis of the within-host evolutionary dynamics of E. faecium gut isolates sampled from five patients over a period of 8 years. Whole-genome sequencing analysis showed that the rapid diversification of E. faecium clones in the patient gut is mainly due to recombination and phages. This high diversification allows E. faecium clones to acquire new genes, including antibiotic resistance genes, which allows the bacterium to rapidly colonize hostile environments.

INTRODUCTION

In recent decades, Enterococcus faecium, normally a harmless gut commensal, has emerged as an important multi-drug resistant nosocomial pathogen. Previous work has shown that clinical isolates of E. faecium form a subpopulation that is distinct from strains isolated from animals and healthy humans (Lebreton et al., 2013). We used whole-genome sequencing to characterize how clinical E. faecium strains evolve during long-term patient gut colonization.

METHODS

The genomes of 96 E. faecium gut isolates, obtained over 8 years from 5 different patients, were sequenced using Illumina HiSeq 2x100 bp paired-end sequencing. Sequence reads were quality-filtered using Nesoni (version 0.117) (Nesoni, 2014), and the high-quality reads were assembled into contiguous sequences using the SPAdes assembler (version 3.1.0) (Bankevich et al., 2012). The assembled sequences were then annotated using Prokka (version 1.10) (Seemann, 2014). In addition to these 96 genomes, we included publicly available genome sequences of 70 E. faecium strains, downloaded from the NCBI GenBank database.

In the resulting set of 166 strains, orthology between genes was identified using orthAgogue (Ekseth et al., 2014), and orthologous genes were clustered into ortholog groups using the MCL algorithm (Enright et al., 2002). Core genome alignments were constructed by concatenating core gene sequences and were filtered for recombination using Gubbins (Croucher et al., 2015). The recombination-filtered core genome alignments were then used to construct a phylogenetic tree. In addition to the core-genome based analyses, we studied gene gain and loss over time.

RESULTS & DISCUSSION

As expected, all 96 isolates grouped in E. faecium clade A. Only one strain clustered in clade A-2, which mainly contains animal isolates. The remaining 95 strains were assigned to clade A-1, which consists almost exclusively of clinical isolates. The phylogenetic tree showed 5 clusters of closely related patient strains, revealing the microevolution of E. faecium strains during gut colonization. We also suspect that strains were transferred directly between patients during hospitalization in the same ward. Additionally, analysis of gene gain and loss over time showed that the loss and gain of prophages is an important factor in generating genetic diversity during gut colonization. This study highlights the ability of E. faecium clones to rapidly diversify, which may contribute to the ability of this bacterium to efficiently colonize new environments and rapidly acquire antibiotic resistance determinants.

REFERENCES

Lebreton F, et al. "Emergence of epidemic multidrug-resistant Enterococcus faecium from animal and commensal strains". MBio. 4(4):e00534-13, 2013.
Nesoni. https://github.com/Victorian-Bioinformatics-Consortium/nesoni
Bankevich A, et al. "SPAdes: A New Genome Assembly Algorithm and Its Applications to Single-Cell Sequencing". Journal of Computational Biology. 19(5):455-477, 2012.
Seemann T. "Prokka: rapid prokaryotic genome annotation". Bioinformatics. 30(14):2068-9, 2014.
Ekseth OK, et al. "orthAgogue: an agile tool for the rapid prediction of orthology relations". Bioinformatics. 30(5):734-6, 2014.
Enright AJ, et al. "An efficient algorithm for large-scale detection of protein families". Nucleic Acids Res. 30(7):1575-1584, 2002.
Croucher NJ, et al. "Rapid phylogenetic analysis of large samples of recombinant bacterial whole genome sequences using Gubbins". Nucleic Acids Res. 43(3):e15, 2015.
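
As an illustration of the core-genome and gene gain/loss steps described in the METHODS of P6, the sketch below shows one way these operations could be expressed. It is not the authors' pipeline: the data layout (a dictionary mapping each ortholog group to one already-aligned gene sequence per strain) and the function names core_genome_alignment and gain_and_loss are assumptions made for this example, and it assumes at most one gene per strain in each group.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical layout: ortholog group id -> {strain name: aligned gene sequence}.
# Assumes each strain contributes at most one gene per group and that the
# sequences within a group have already been aligned to equal length.
OrthologGroups = Dict[str, Dict[str, str]]


def core_genome_alignment(
    groups: OrthologGroups, strains: List[str]
) -> Tuple[List[str], Dict[str, str]]:
    """Concatenate the genes of every group that is present in all strains."""
    core = sorted(
        gid for gid, members in groups.items()
        if all(strain in members for strain in strains)
    )
    alignment = {
        strain: "".join(groups[gid][strain] for gid in core)
        for strain in strains
    }
    return core, alignment


def gain_and_loss(earlier: Set[str], later: Set[str]) -> Tuple[Set[str], Set[str]]:
    """Return (gained, lost) ortholog groups between two time-ordered isolates."""
    return later - earlier, earlier - later


if __name__ == "__main__":
    toy_groups = {
        "og1": {"A": "ATG-", "B": "ATGC"},
        "og2": {"A": "TTA"},            # accessory: missing from strain B
        "og3": {"A": "GG", "B": "GC"},
    }
    core, aln = core_genome_alignment(toy_groups, ["A", "B"])
    print(core, aln)
    print(gain_and_loss({"og1", "og2"}, {"og1", "og3"}))
```

In a real analysis the concatenated alignment would then be passed to a recombination filter such as Gubbins before tree building, as the abstract describes.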
P7. XCMS OPTIMISATION IN HIGH-THROUGHPUT LC-MS QC

Charlie Beirnaert 1,2*, Matthias Cuykx 3, Adrian Covaci 3 & Kris Laukens 1,2.
Advanced Database Research and Modeling (ADReM), University of Antwerp 1; Biomedical Informatics Research Centre Antwerp (biomina) 2; Toxicological Centre, University of Antwerp 3.
* charlie.beirnaert@uantwerpen.be

In high-throughput untargeted metabolomics studies, quality control (QC) is still a prominent bottleneck. By analogy with a recently developed QC tool for proteomics, work in our research group aims to develop a QC environment specific to metabolomics. One component of this work is the XCMS analysis software for LC-MS data, which is very sensitive to its input parameters. The presented work deals with the automatic optimisation of the XCMS parameters by building on an existing framework for XCMS optimisation. The additions to this framework are quantified resolution data, obtained from the otherwise ignored profile data, and intelligent use of the isotopic profile of measured compounds.

INTRODUCTION

Metabolomics is the study of small molecules, or metabolites. These metabolites have an enormous chemical diversity and are only now starting to be identified in a high-throughput fashion. This is due to the adoption of high-performance liquid chromatography-mass spectrometry (LC-MS) and nuclear magnetic resonance (NMR) spectroscopy. However, analysing these large datasets is not trivial; for LC-MS in particular, there are almost more ways of analysing data than there are researchers. Arguably, the most commonly used software platform for the initial analysis is XCMS (Smith et al., 2006). However, the output of XCMS depends strongly on the input parameters. Often the default parameters are used, or they are adapted according to the intuition of the researcher, without accounting for the false positives this may introduce. Optimisation algorithms have been constructed using a dilution series (Eliasson et al., 2012) and using the carbon isotope (Libiseller et al., 2015). In this work, we build on the latter by including quantified information from the profile m/z domain (the continuous data in the m/z dimension), where accurate resolutions can be obtained for the mono-isotopic peaks and other isotopes. The developed optimisation can be used both for data analysis and for the quality control framework that is under development.

METHODS

The proposed work uses XCMS to find the peaks of interest in the data. To optimise this process, the results from XCMS are analysed for the occurrence of peaks and their isotopes. In this step, the raw profile data around the peaks identified by XCMS is inspected to quantify the peak resolution and to detect missed isotopes.

Centroid vs profile data: Modern-day MS specialists use centroid data because the file size is considerably smaller. The mass spectrometer converts the continuous data in the m/z dimension to a collection of spikes, where each approximately Gaussian peak is converted to a single spike (a delta function with the same height as the original peak). All other data is discarded. The result is a huge reduction in file size but a loss of the peak shape, so the resolution can no longer be quantified.

Optimisation parameter: The peaks and their isotopes are characterised by a Gaussian in the chromatographic dimension and are spaced approximately 1.0034 Da apart in the m/z dimension (the 13C-12C mass difference for singly charged ions). When an isotope is missing or the extracted peak does not appear in enough samples (for example, in 50% of the samples in the sample group), the peak is categorised as "unreliable". When a peak is present in all samples or has a clear isotopic distribution, it is considered "reliable". With these measures a so-called peak picking score can be calculated, which in turn can be optimised by a variety of methods. This increases the number of reliable peaks without increasing false positives.

Analysis & quality control: Optimising the XCMS parameters is useful both in the analysis of the data itself and in quality control for large-scale LC-MS experiments. By quantifying the resolutions of all relevant peaks in a dataset corresponding to a control sample, it is possible to monitor the quality of spectra. Combined with other QC frameworks, such as iMonDB (Bittremieux et al., 2015), this makes it possible to assure the quality of all experiments in a long-running study.

RESULTS & DISCUSSION

The aim is to use the profile data to improve the available optimisation algorithms. It remains to be seen whether the extra information in this data (compared to centroid data) justifies the increased need for computing resources. Nonetheless, profile data provides a valuable contribution to LC-MS optimisation, because it enables researchers to quantitatively evaluate and improve the m/z resolution.

REFERENCES

Smith CA et al. Anal. Chem. 78(3), 779-789, (2006).
Eliasson M. et al. Anal. Chem. 84(15), 6869-6876, (2012).
Libiseller G. et al. BMC Bioinformatics 16:118, (2015).
Bittremieux W. et al. J. Proteome Res. 14(5), 2360-2366, (2015).
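
The reliable/unreliable classification described under "Optimisation parameter" in P7 can be sketched roughly as follows. This is one possible reading of the criteria, not the authors' implementation: the Peak fields, the tolerances mz_tol and rt_tol, the 13C spacing constant and the rule for combining the isotope and sample-coverage criteria are all assumptions. The abstract does not give the formula of the peak picking score, so reliable_fraction below is only a placeholder for it.

```python
from dataclasses import dataclass
from typing import List

# Assumed 13C-12C mass difference in Da for singly charged ions.
C13_SPACING = 1.00335


@dataclass
class Peak:
    mz: float               # mass-to-charge ratio of the picked peak
    rt: float               # retention time in seconds
    sample_fraction: float  # fraction of samples in the group containing the peak


def in_isotope_pair(peak: Peak, peaks: List[Peak],
                    mz_tol: float = 0.005, rt_tol: float = 2.0) -> bool:
    """True if a co-eluting peak lies one 13C spacing above or below `peak`."""
    return any(
        abs(abs(other.mz - peak.mz) - C13_SPACING) <= mz_tol
        and abs(other.rt - peak.rt) <= rt_tol
        for other in peaks
    )


def classify(peaks: List[Peak], min_fraction: float = 0.5) -> List[str]:
    """Label each peak; the combination rule is an assumption of this sketch."""
    labels = []
    for peak in peaks:
        in_all_samples = peak.sample_fraction >= 1.0
        isotope_ok = (in_isotope_pair(peak, peaks)
                      and peak.sample_fraction >= min_fraction)
        labels.append("reliable" if in_all_samples or isotope_ok else "unreliable")
    return labels


def reliable_fraction(peaks: List[Peak]) -> float:
    """Placeholder score: share of reliable peaks (not the published formula)."""
    if not peaks:
        return 0.0
    return classify(peaks).count("reliable") / len(peaks)
```

A parameter optimiser would re-run peak picking with different XCMS settings and keep the settings that maximise whichever score is chosen in place of this placeholder.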