Annual Scientific Report 2015


Sameer Velankar
PDBe Content and Integration
PhD, Indian Institute of Science, 1997.
Postdoctoral researcher, Oxford University, United Kingdom, 1997-2000.

wwPDB validation pipeline, users can now identify “best quality” structures for given macromolecules based on our newly designed data-access mechanisms.

We implemented new data query systems and web pages using a user-centric approach, with user surveys and input at critical stages of development. We initiated user involvement with a survey to establish essential requirements. The results of the survey and feedback on early prototypes informed the design of the new query system and web pages. We then developed new ways of presenting structure data to help non-expert biologists understand the structure information. A set of images providing rich insights into the quaternary structure or ligand-binding sites, or displaying sequence and structure domain annotations, now helps users with different levels of expertise understand the structure data available in the PDB. We integrated interactive web components showing annotations on sequence (1D) and structure (3D) data on the new entry pages. When user feedback suggested that general biologists struggled with these displays, we implemented an interactive topology (2D) viewer to provide a more seamless link between sequence-based data and 3D structure. The viewer displays these annotations on the topology diagram, which provides a schematic view of the arrangement of secondary structure elements in the tertiary structure. These diagrams are the result of a collaboration between PDBe and Dr Roman Laskowski from the Thornton research group.

The new query system provides an easy-to-use interface and basic data-analysis capabilities. It allows users to browse structure entries in the archive and offers filters for “drilling down” to a subset of entries. It also allows users to identify the most suitable structure from a given set by grouping entries based on unique macromolecules, small molecules and sequence families, and displaying the “best” quality structure in each group based on wwPDB validation data.
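The filter, group and pick-the-best workflow described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the PDBe implementation: the entry identifiers, the field names (`molecule`, `method`, `quality`) and the single numeric quality score are all hypothetical stand-ins for the real archive data and the wwPDB validation metrics.

```python
# Hypothetical records; identifiers, field names and scores are made up
# for illustration and do not reflect the PDBe data model.
ENTRIES = [
    {"pdb_id": "1aaa", "molecule": "Lysozyme", "method": "X-ray", "quality": 0.62},
    {"pdb_id": "1bbb", "molecule": "Lysozyme", "method": "X-ray", "quality": 0.91},
    {"pdb_id": "1ccc", "molecule": "Lysozyme", "method": "NMR", "quality": 0.95},
    {"pdb_id": "1ddd", "molecule": "Insulin", "method": "X-ray", "quality": 0.77},
]


def filter_entries(entries, **criteria):
    """Drill down: keep entries whose fields equal every given criterion."""
    return [e for e in entries if all(e.get(k) == v for k, v in criteria.items())]


def best_per_group(entries, group_key="molecule", score_key="quality"):
    """Group entries by group_key and keep the highest-scoring one per group."""
    best = {}
    for entry in entries:
        key = entry[group_key]
        if key not in best or entry[score_key] > best[key][score_key]:
            best[key] = entry
    return best


xray_only = filter_entries(ENTRIES, method="X-ray")
representatives = best_per_group(xray_only)
```

Here a higher score is assumed to mean better quality; a real ranking would combine several validation criteria rather than a single number.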

Our new search system offers extended functionality developed in the BioSolr project, a close collaboration between PDBe, the Samples, Phenotypes and Ontologies team and Flax, a Cambridge-based search technology company. A new plugin developed by BioSolr to integrate external programs now enables users to query PDB entries based on sequence. Following usability testing, the new feature will be released in 2016.

The team also publicly released a REST API to provide both query access and entry information for PDB and EMDB entries. The widely used 3D structure viewer Jmol/JSmol and the interactive sequence-analysis application JalView both make use of the API. The API is also used internally to ensure data consistency across all PDBe services, and provides access to value-added annotation from the SIFTS resource and data-quality information from the wwPDB validation data.
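As a rough sketch of programmatic access to such an API, the Python below builds a request URL for an entry summary and extracts a field from the JSON reply. The base URL, the `pdb/entry/summary` path and the response shape (a mapping from entry ID to a list of records with a `title` field) are assumptions not stated in this report, so check them against the current PDBe API documentation before relying on them.

```python
import json
from urllib.request import urlopen

# Assumed base URL and path layout; not taken from this report.
BASE_URL = "https://www.ebi.ac.uk/pdbe/api"


def summary_url(pdb_id):
    """Build the (assumed) summary endpoint URL for one PDB entry."""
    return f"{BASE_URL}/pdb/entry/summary/{pdb_id.lower()}"


def fetch_json(url, timeout=30):
    """Perform a GET request and decode the JSON body."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


def entry_title(payload, pdb_id):
    """Return the title from an (assumed) {id: [record, ...]} payload, or None."""
    records = payload.get(pdb_id.lower(), [])
    return records[0].get("title") if records else None
```

A call such as `entry_title(fetch_json(summary_url("1cbs")), "1cbs")` would then return the entry title, provided the assumed endpoint and response layout hold.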

At EMBL-EBI since 2000.
Team leader since 2011.

The wwPDB now uses the infrastructure developed in 2014 to integrate small-molecule crystal structures from the Cambridge Structural Database (CSD) into the PDB Chemical Component Dictionary (CCD). In 2015, a file containing information on 1418 chemical components found to be common between the CCD and CSD was released in the wwPDB public FTP area.

Future plans

We will continue to work on the wwPDB common deposition and annotation system to extend its functionality to include more experimental techniques. A major update, version 2.0, will be implemented at all wwPDB sites in 2016. This will allow PDBe to process all European and African depositions for both PDB and EMDB. We plan to continue improving the data quality of PDB and EMDB archive entries. We will also adapt PDBe services that provide value-added annotations and data analysis so that they can be integrated into the new infrastructure. We will implement major improvements to the PDBe query interface and entry pages, responding to user needs in terms of functionality and data access. As additional value-added information becomes available, we will extend the PDBe REST API so that the information can be accessed programmatically. To make our users aware of new developments, we will continue to participate in and organise international conferences, workshops and training events, engage on social media and publish scholarly articles.

Selected publications

Gutmanas A, et al. (2015) NMR Exchange Format: a unified and open standard for representation of NMR restraint data. Nat. Struct. Mol. Biol. 22:433-434

Lewis TE, et al. (2015) Genome3D: exploiting structure to help users understand their sequences. Nucleic Acids Res. 43:D382-D386

Meldal BH, et al. (2015) The complex portal - an encyclopaedia of macromolecular complexes. Nucleic Acids Res. 43:D479-D484

Sali A, et al. (2015) Outcome of the First wwPDB hybrid/integrative methods task force workshop. Structure 23:1156-1167

Westbrook JD, et al. (2015) The chemical component dictionary: complete descriptions of constituent molecules in experimentally determined 3D macromolecules in the Protein Data Bank. Bioinformatics 31:1274-1278

2015 EMBL-EBI Annual Scientific Report
