Annual Scientific Report 2015
Sameer Velankar
PDBe Content and Integration
PhD, Indian Institute of Science, 1997.
Postdoctoral researcher, Oxford University,
United Kingdom, 1997-2000.
wwPDB validation pipeline, users can now identify “best
quality” structures for given macromolecules based on
our newly designed data-access mechanisms.
We took a user-centric approach to implementing new data
query systems and web pages, gathering survey responses
and user input at critical stages of development. We began
with a user survey to establish essential requirements.
The results of the survey and feedback on
early prototypes informed the design of the new query
system and web pages. We then developed new ways of
presenting structure data to help non-expert biologists
understand the structure information. A set of images
providing rich insights into the quaternary structure
or ligand-binding sites, or displaying sequence and
structure domain annotations, now helps users with
different levels of expertise understand the structure
data available in the PDB. We integrated interactive
web components showing annotations on sequence
(1D) and structure (3D) data on the new entry pages.
When user feedback suggested that general biologists
struggled with these displays, we implemented an
interactive topology (2D) viewer to provide a more
seamless link between sequence-based data and 3D
structure. The viewer displays sequence-based information
on the topology diagram, which provides a schematic view of
the arrangement of secondary structure elements in
the tertiary structure. These diagrams are the result of a
collaboration between PDBe and Dr Roman Laskowski
from the Thornton research group.
The new query system provides an easy-to-use interface
and basic data-analysis capabilities. It allows users
to browse structure entries in the archive and offers
filters for “drilling down” to a subset of entries. It also
allows users to identify the most suitable structure
from a given set by grouping entries based on unique
macromolecules, small molecules and sequence
families, displaying the “best” quality structure based on
wwPDB validation data.
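The grouping step described above can be sketched in a few lines of Python. This is only an illustration, not the PDBe implementation: the field names (`entry_id`, `molecule`, `quality_score`), the example records and the idea of a single numeric score where higher means better are all assumptions made for the example.

```python
def best_per_group(entries, group_key="molecule", score_key="quality_score"):
    """Return a dict mapping each group to its highest-scoring entry.

    Field names are hypothetical; ties keep the entry seen first.
    """
    best = {}
    for entry in entries:
        group = entry[group_key]
        current = best.get(group)
        if current is None or entry[score_key] > current[score_key]:
            best[group] = entry
    return best


# Made-up records for illustration only (not real PDB entries or scores).
example_entries = [
    {"entry_id": "entry-A", "molecule": "lysozyme", "quality_score": 0.62},
    {"entry_id": "entry-B", "molecule": "lysozyme", "quality_score": 0.91},
    {"entry_id": "entry-C", "molecule": "insulin", "quality_score": 0.75},
]

best = best_per_group(example_entries)
for molecule, entry in best.items():
    print(molecule, entry["entry_id"])
```

The same function could group by small molecule or sequence family by passing a different `group_key`, provided the records carry such a field.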
Our new search system offers extended functionality
developed in the BioSolr project, a close collaboration
between PDBe, the Samples, Phenotypes and Ontologies
team and Flax, a Cambridge-based search technology
company. A new plugin developed by BioSolr to integrate
external programs now enables users to query PDB
entries based on sequence. Following usability testing,
the new feature will be released in 2016.
The team also publicly released a REST API to provide
both query access and entry information for PDB and
EMDB entries. The widely used 3D structure viewer
Jmol/JSmol and the interactive sequence-analysis
application Jalview both make use of the API. The API is
also used internally to ensure data consistency across all
PDBe services, and provides access to value-added annotation
from the SIFTS resource and data-quality information
from the wwPDB validation data.
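As a rough illustration of programmatic access, the sketch below builds a request URL for one entry and extracts a title from the JSON response. The base URL, the `/pdb/entry/summary/` path, the response layout (a dict keyed by lower-case entry ID holding a list of records) and the `title` field are assumptions made for this example and should be checked against the current PDBe API documentation.

```python
import json
from urllib.request import urlopen

# Assumed base URL and path layout; verify against the PDBe API documentation.
BASE_URL = "https://www.ebi.ac.uk/pdbe/api"


def summary_url(entry_id, base_url=BASE_URL):
    """Build the (assumed) summary URL for a PDB entry ID."""
    return f"{base_url}/pdb/entry/summary/{entry_id.lower()}"


def parse_title(payload, entry_id):
    """Extract the title from a payload assumed to look like {id: [{...}]}."""
    records = payload.get(entry_id.lower(), [])
    return records[0].get("title") if records else None


def fetch_title(entry_id):
    """Fetch the summary over HTTP and return the entry title, if present."""
    with urlopen(summary_url(entry_id), timeout=30) as response:
        payload = json.load(response)
    return parse_title(payload, entry_id)
```

Calling `fetch_title("1cbs")` would perform a live request; keeping URL construction and parsing in separate functions lets both be checked without network access.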
At EMBL-EBI since 2000.
Team leader since 2011.
The wwPDB now uses the infrastructure
developed in 2014 to integrate small-molecule crystal
structures from the Cambridge Structural Database
(CSD) into the PDB Chemical Component Dictionary
(CCD). In 2015, a file containing information on 1418
chemical components found to be common between the
CCD and CSD was released in the wwPDB public FTP
area.
Future plans
We will continue to work on the wwPDB common
deposition and annotation system to extend its
functionality to include more experimental techniques.
A major update, version 2.0, will be implemented at all
wwPDB sites in 2016. This will allow PDBe to process
all European and African depositions for both PDB and
EMDB. We plan to continue improving the data quality
of PDB and EMDB archive entries. We will also adapt
PDBe services that provide value-added annotations and
data analysis so that they can be integrated into the new
infrastructure. We will implement major improvements
to the PDBe query interface and entry pages, responding
to user needs in terms of functionality and data access.
As additional value-added information becomes
available, we will extend the PDBe REST API so that the
information can be accessed programmatically. To make
our users aware of new developments, we will continue
to participate in and organise international conferences,
workshops and training events, engage on social media
and publish scholarly articles.
Selected publications
Gutmanas A, et al. (2015) NMR Exchange Format: a
unified and open standard for representation of NMR
restraint data. Nat. Struct. Mol. Biol. 22:433-434
Lewis TE, et al. (2015) Genome3D: exploiting structure
to help users understand their sequences. Nucleic Acids
Res. 43:D382-D386
Meldal BH, et al. (2015) The complex portal--an
encyclopaedia of macromolecular complexes. Nucleic
Acids Res. 43:D479-D484
Sali A, et al. (2015) Outcome of the First wwPDB hybrid/
integrative methods task force workshop. Structure
23:1156-1167
Westbrook JD, et al. (2015) The chemical component
dictionary: complete descriptions of constituent
molecules in experimentally determined 3D
macromolecules in the Protein Data Bank.
Bioinformatics 31:1274-1278
2015 EMBL-EBI Annual Scientific Report