EMBL-EBI Annual Scientific Report 2012

More documents

Recommendations

Info

Proteomics servicesThe Proteomics Services team develops tools and resources for therepresentation, deposition, distribution and analysis of proteomics andsystems biology data.We follow an open-source, open-data approach: all of the resources we develop are freely available. The team is a majorcontributor to community standards, in particular the Proteomics Standards Initiative (PSI) of the international HumanProteome Organisation (HUPO) and systems biology standards (COMBINE Network). We provide public databases asreference implementations for community standards: the PRIDE proteomics identifications database, the IntAct molecularinteraction database, the Reactome pathway database and BioModels Database, a repository of computational models ofbiological systems.As a result of long-term engagement with the community, journal editors and funding organisations, data deposition in ourstandards-compliant data resources is becoming a strongly recommended part of the publishing process. This has resulted ina rapid increase in the data content of our resources. Our curation teams ensure consistency and appropriate annotation of alldata, whether from direct depositions or literature curation, to provide the community with high-quality reference datasets.We also contribute to the development of data integration technologies, using protocols like Distributed Annotation System(DAS) and semantic web technologies, and provide stable identifiers for biomolecular entities through identifiers.org.Major achievementsA major success in 2012 was achieving full production modefor the ProteomeXchange consortium. In this EU-fundedconsortium, PRIDE works with a number of internationalpartners (e.g., PeptideAtlas, UniProt, University of Ghent,University of Liverpool, ETH Zurich, University of Michigan,Wiley-VCH) to co-ordinate data deposition and disseminationstrategies for mass spectrometry data, providing a singleentry point for data deposition, a shared accession numberspace and a deposition metadata format. In spring 2012,ProteomeXchange started full production and achieved gooduser acceptance, reaching 77 submissions with more than 20million spectra by November 2012 (see Figure).ProteomeXchange submission strategies include thecapability to deposit large raw datasets, and in 2012 wecompletely redeveloped PRIDE data deposition support,publishing Java libraries (Griss et al, 2012; Reisinger et al,2012), an updated PRIDE Converter submission tool (Côté etal, 2012), and the PRIDE Inspector data access tool (Wang etal, 2012).IMEx is an international consortium of major molecularinteraction data providers that globally synchronise theirdata deposition and curation efforts (Orchard et al, 2012).IMEx partners share formats, identifier spaces and curationstrategies and directly share the web-based IntAct curationinfrastructure (Kerrien et al., 2012). This avoids redundantdevelopment while retaining the value of each individualresource. IMEx partners include UniProt (Switzerland andthe United Kingdom), I2D (Canada), InnateDB (Ireland andCanada), Molecular Connections (India) and MechanoBio(Singapore).Reactome provides review-style, curated and peer-reviewedhuman pathways in a computationally accessible form. In2012 we collaborated with the Ouwehand group to providea detailed update of platelet-related pathways (Jupe et al,2012), as well as many other high-profile pathways. Wecurate disease variants of normal physiological pathways toprovide users with information about mutations and how theyaffect pathways. We also focus on cancer-relatedsignalling pathways.Our BioModels team contributed to a large-scale collaborativeeffort for the semi-automated generation of systems biologymodels based on KEGG pathways, Biocarta, MetaCyc andSABIO-RK data. More than 142 000 models, covering 1852species, were generated, opening up new opportunities46 2012 EMBL-EBI Annual Scientific Report
Henning HermjakobMSc Bioinformatics University of Bielefeld, Germany,1995. Research Assistant at the German NationalCentre for Biotechnology (GBF), 1996.At EMBL-EBI since 1997.to explore and refine models. We rose to the challenge ofintegrating these models into BioModels Database in 2012,increasing the number of available models by two orders ofmagnitude.Future plansFollowing the upgrade of the PRIDE submission system, wehave turned our attention to redeveloping the core databaseand web interface. This is essential for maintaining goodresponse times and helping users access increasingly largeand complex proteomics datasets. We will begin providingquality-controlled subsets and derived datasets from PRIDE,evolving this resource from a primary database into a systemsbiology source of protein expression data.IntAct will increasingly provide confidence-scored interactiondatasets derived from integration of individual publications.The same strategies will be applied, where possible, tointeraction data from multiple sources, which we accessthrough the PSICQUIC interface. This will ensure we canprovide integrated, up-to-date interaction datasets. We willredesign the IntAct website in 2013 to improve accessibility.In 2013 we will make IntAct datasets accessible throughReactome. This will enhance Reactome’s visualisation ofmolecular interactions in the context of molecular pathways.The next Reactome website release will include a newinteractive pathway viewer, and will feature improved overlayof external information such as expression data. Reactomecuration will focus on disease-induced modifications ofpathways, supported by improved visualisation toolsand close collaboration with disease-oriented researchcommunities.We will develop the new storage infrastructure for theBioModels Database so that the resource can cope with theinflux of computationally generated models. We will developnew features (e.g., full model versioning, support for moremodelling formats), update the user interface, enhance searchfacilities and improve overall performance.We will continue to work with journals, editors and dataproducers to make more data publicly available by utilisingcommunity-supported standards.Figure. Distribution of ProteomeXchange submissions in 20122012 EMBL-EBI Annual Scientific Report47
Page 1 and 2: EMBL-European Bioinformatics Instit
Page 3: Table of contentsIntroduction & ove
Page 6 and 7: EMBL-EBI 2012It was a year of trans
Page 8 and 9: New service developments• Underst
Page 10 and 11: Organisation ofEMBL-EBI Leadership
Page 12 and 13: dGenes, genomes and variationThe Eu
Page 14 and 15: dGenes, genomes and variationSummar
Page 16 and 17: European Nucleotide ArchiveOur team
Page 18 and 19: Vertebrate genomicsThe Vertebrate G
Page 20 and 21: Nonvertebrate genomicsWe provide to
Page 22 and 23: gMolecular atlasLife scientists are
Page 24 and 25: Functional genomicsThe Functional G
Page 26 and 27: Functional genomics productionOur t
Page 28 and 29: Functional genomics developmentOur
Page 30 and 31: PProteins and protein familiesUniPr
Page 32 and 33: UniProt contentOne of the central a
Page 34 and 35: UniProt developmentOur team provide
Page 36 and 37: InterProOur team co-ordinates the I
Page 38 and 39: sMolecular and cellular structureUn
Page 40 and 41: Protein Data Bank in EuropeThe majo
Page 42 and 43: PDBe content and integrationOur goa
Page 44 and 45: PDBe databases and servicesOur team
Page 46 and 47: yMolecular systemsThe genes and gen
Page 50 and 51: Chemical biologyThe importance of s
Page 52 and 53: ChEMBLThe ChEMBL team develops and
Page 54 and 55: Cheminformatics and metabolismOur t
Page 56 and 57: cCross-domain toolsand resourcesSci
Page 58 and 59: Literature servicesScientific liter
Page 60 and 61: Research2012 has seen the further t
Page 62 and 63: Bertone groupPluripotency, reprogra
Page 64 and 65: Birney groupNucleotide dataDNA sequ
Page 66 and 67: Enright groupFunctional genomics an
Page 68 and 69: Goldman groupEvolutionary tools for
Page 70 and 71: Le Novère groupComputational syste
Page 72 and 73: Luscombe groupGenomics and regulato
Page 74 and 75: Marioni groupComputational and evol
Page 76 and 77: Rebholz groupPhenotypes and multili
Page 78 and 79: Saez-Rodriguez groupSystems biomedi
Page 80 and 81: Thornton groupProteins: structure,
Page 82 and 83: The EMBL International PhDProgramme
Page 84 and 85: SupportOur support teams provide fo
Page 86 and 87: T TrainingAs part of EMBL-EBI’s m
Page 88 and 89: IIndustry programmeSince 1996 the I
Page 90 and 91: NExternal relationsAs a European In
Page 92 and 93: sExternal servicesOur team manages
Page 94 and 95: SSystems and networkingOur team man
Page 96 and 97: q AdministrationThe EMBL-EBI Admini
Page 98 and 99:
Funding and resource allocationDesp
Page 100 and 101:
Growth of core resourcesIn 2012 the
Page 102 and 103:
CollaborationsEMBL-EBI is a highly
Page 104 and 105:
Staff growthOur organisational stru
Page 106 and 107:
Scientific advisory commiteesEMBL S
Page 108 and 109:
The International Nucleotide Sequen
Page 110 and 111:
EMDataBank Advisory Committee• Jo
Page 112 and 113:
Major database collaborationsARRAYE
Page 114 and 115:
THE GENE ONTOLOGY CONSORTIUM• Agb
Page 116 and 117:
REACTOME• New York University Med
Page 118 and 119:
Publications in 2012In 2012, EMBL-E
Page 120 and 121:
Doreleijers, J. F., Vranken W. F.,
Page 122 and 123:
Kruger, F. A., Rostom R. and Overin
Page 124 and 125:
Sahakyan, Aleksandr B., Cavalli And
Page 128:
EMBL - European Bioinformatics Inst
show all

EMBL-EBI Annual Scientific Report 2012

You also want an ePaper? Increase the reach of your titles

Delete template?

Save as template?