Annual Scientific Report 2015

Vertebrate Genomics

Paul Flicek, Bronwen Aken, Andrew Yates and Daniel Zerbino

• Welcomed three new Team Leaders in September: Bronwen Aken, Andrew Yates and Daniel Zerbino;

• Updated the data from several major projects (e.g. 1000 Genomes Project, BLUEPRINT) to reflect the new GRCh38 human reference assembly;

• Issued five major releases of Ensembl, and provided updates to other highly used resources, e.g. human (now assembly v. GRCh37.p8) and mouse (now GRCm38.p4) genomes;

• Published the Ensembl regulatory build and the genome of the vervet monkey;

• Released important annotation updates to the rat (Rnor_6.0) and zebrafish (GRCz10) genome assemblies, and introduced a dynamic gene gain/loss view of these datasets;

• Released BLUEPRINT data via GenomeStats, a web-based tool for carrying out analyses of epigenomic data;

• Released the beta version of our TrackHub registry;

• Developed new views and tools, enhanced performance and usability of existing views, extended support for track hubs, and improved our mirror sites;

• Developed a new BioMart system to provide fast access to all regulatory data from the new Ensembl Regulatory Build and added new bindings in Bioconductor;

• Helped launch the Functional Annotation of Animal Genomes (FAANG) project, in which we lead efforts to define data and metadata standards;

• Upgraded the HipSci project website to improve the discoverability of individual cell lines and related data;

• Improved the display of variation data tables and introduced Manhattan plots for linkage disequilibrium data;

• Managed the growth in usage of the Ensembl REST API, which received over 70 million requests in 2015;

• Introduced a new visualisation tool for long-range connections between genomic regions;

• Promoted Ensembl resources through social media, conferences, webinars and 97 workshops;

• Helped complete the relocation of the GWAS Catalog software infrastructure from the NHGRI in the US to EMBL-EBI;

• Improved the GWAS Catalog website by updating the search interface with Solr technology and supporting ontology expansion queries.

Non-vertebrate Genomics<br />

Paul Kersey<br />

• Issued six public releases of Ensembl Genomes;<br />

• Contributed to the regular data releases of VectorBase, WormBase and PomBase;

• Increased the number of bacterial genomes available through the Ensembl public interface to nearly 30 000;

• Increased the number of fungal genomes 10-fold and protist genomes 5-fold;

• Made major contributions to the paper describing the genome of Anopheles stephensi, the primary mosquito vector of malaria in urban India;

• Extended community curation to plant pathogens, and released new data-mining tools for PhytoPath and WormBase ParaSite;

• Issued a new, more contiguous and complete assembly of the bread wheat genome to the research community.

Variation

Justin Paschall and Helen Parkinson<br />

• Handled a 50% increase in the volume of data archived in the European Genome-phenome Archive (EGA) and a 65% increase in the number of files submitted;

• Deployed a new EGA downloader service, which distributed over 1.7 petabytes of data;

• Implemented a Global Alliance for Genomics and Health ‘Beacon’ for the EGA, enabling users to access a limited collection of variation data through a single, three-tiered entry point;

• Rebuilt the EGA pipeline, reducing the quarterly average processing time from three weeks to one and a half days;

• In collaboration with colleagues at CRG Barcelona, increased the EGA’s capacity to distribute data via FTP, Aspera and a customised downloader;

• Managed the growth of the European Variation Archive to 22 datasets on various organisms, including crop species and domesticated animals;

• Made available datasets from Phase 3 of the 1000 Genomes Project and from the Exome Aggregation Consortium (ExAC);

• Improved the EVA browser by integrating variant annotations generated by the Ensembl Variant Effect Predictor tool and applying advanced search filters;

• Improved the representation of clinical information with a new display for data from ClinVar;

• Helped standardise ClinVar data in the context of Open Targets (formerly CTTV) and developed global standards for variation data as part of the GA4GH.

2015 EMBL-EBI Annual Scientific Report