Annual Scientific Report 2015


Non-vertebrate Genomics

High-throughput sequencing is transforming both understanding and application of the biology of many organisms. Our team integrates, analyses and disseminates these data for scientists working in domains as diverse as agriculture, pathogen-mediated disease and the study of model organisms. We run services for bacterial, protist, fungal, plant and invertebrate metazoan genomes, mostly using the power of the Ensembl software suite, and usually in partnership with interested communities.

In such collaborations we contribute to the development of many resources, including VectorBase (Giraldo-Calderon et al., 2015) for invertebrate vectors of human disease, WormBase (Howe et al., 2016) for nematode biology, PomBase (McDowall et al., 2015) for the fission yeast Schizosaccharomyces pombe, and PhytoPath (Pedro et al., 2016) for plant pathogens. In the plant domain, we collaborate closely with Gramene in the US and with a range of European groups in the transPLANT and ELIXIR-EXCELERATE projects.

By collaborating with EMBL-EBI and re-using our established toolset, small communities with little informatics infrastructure can perform and interpret highly complex and data-generative experiments, the type of work that was once the sole domain of large, internationally co-ordinated sequencing projects. We also work on large, complex genomes like hexaploid bread wheat, establishing informatics frameworks for the analysis of species for which genomic data is only now gaining traction as technologies improve.

Our major activities include genome annotation, broad-range comparative genomics and the visualisation and interpretation of genomic variation, which is studied increasingly in species throughout the taxonomy.

Major achievements

In 2015 we issued six public releases of Ensembl Genomes.
Ensembl Bacteria now includes almost 30 000 genomes from over 5000 distinct species, while the numbers of fungal and protist genomes included have increased approximately 10-fold and 5-fold, respectively, in one year. In 2016 we are likely to deploy, for the genomes of multicellular species, an automated approach similar to the one we currently use to incorporate microorganism genomes. With each release we have updated cross-references and comparative genomics, introduced improved assemblies and annotations, and sourced additional data sets, mapping them onto the relevant genomes and incorporating them into the resource.

We contributed to the regular data releases of PomBase, VectorBase, WormBase and PhytoPath. As part of VectorBase, we contributed to the publication of the genome of Anopheles stephensi, the primary mosquito vector of malaria in urban India. In WormBase, we made substantial progress towards the implementation of a new database framework that should allow for improved performance and more rapid updates to the public site. In both WormBase and PhytoPath, we released new data-mining solutions. Each project has specific challenges, but by re-using infrastructural components in different contexts we have gained efficiencies of scale.

Community curation is a good way of capturing high-value data from the experts. We now run community curation portals for 30 insect vector species using the Web Apollo framework, allowing scientists to modify gene models directly for subsequent incorporation into VectorBase and Ensembl. In PomBase, we collect functional annotations using the Canto tool. In 2015 we extended our use of Web Apollo to plant pathogens for the first time, working with the community to improve the annotation of the necrotrophic fungus Botrytis cinerea, and prepared to deploy Canto for these phytopathogenic species.

In December, we released a new “pre-site” offering access to a new genomic assembly for bread wheat.
Bread wheat has a large, complex genome, and we have been working as part of a BBSRC-funded project to develop and disseminate a new assembly in collaboration with The Genome Analysis Centre, The John Innes Centre, and Rothamsted Research. The new assembly is the most complete, contiguous assembly yet released for this species, and we will be working to annotate it fully over the course of 2016.

In the context of the transPLANT project, we continued to work with the plant science community to develop standards for phenotypic data, and set out our findings in a publication (Krajewski et al., 2015).
Paul Kersey
Non-vertebrate Genomics

PhD University of Edinburgh 1992. Postdoctoral work at University of Edinburgh and MRC Human Genetics Unit, Edinburgh. At EMBL-EBI since 1999.

Future plans

A major project is currently underway to automate the identification and genomic alignment of RNA-seq data submitted to the European Nucleotide Archive. A new pipeline is expected to ensure that all appropriate data sets present in the archives can be visualised through the Ensembl interfaces. In 2016 we will address the issues arising from these developments. For example, these efforts need to be matched with new approaches for storing, annotating, searching and visualising the resulting alignments. Indeed, allowing users to navigate large numbers of alignment “tracks” on an individual genome poses challenges similar to those of allowing them to navigate large numbers of genomes. Neither problem is trivial to solve: likely solutions involve the increased use of programmatic access methods, coupled to tight collaboration with archival resources to ensure the capture of the metadata necessary to support the search functionality desired.

Selected publications

Kersey PJ, Allen JE, Armean I, et al. (2016) Ensembl Genomes 2016: more genomes, more complexity. Nucleic Acids Res. 44:D574-D580.

Krajewski P, Chen D, Ćwiek H, et al. (2015) Towards recommendations for metadata and data handling in plant phenotyping. J. Exp. Bot. 66:5417-5427.

Jiang X, Peery A, Hall AB, et al. (2015) Genome analysis of a major urban malaria vector mosquito, Anopheles stephensi. Genome Biol. 15:459.

Pedro H, Maheswari U, Urban M, et al. (2016) PhytoPath: an integrative resource for plant pathogen genomics. Nucleic Acids Res. 44:D688-D693.

Howe KL, Bolt BJ, Cain S, et al. (2016) WormBase 2016: expanding to enable helminth genomic research. Nucleic Acids Res. 44:D774-D780.

Genomes for all: Ensembl Genomes provides a central resource for addressing all areas of biological research.
