21.11.2014 Views

ayout 1 - EMBL Grenoble


EMBL Research at a Glance 2009

Peter Rice

BSc 1976, University of Liverpool.

Previously at EMBL Heidelberg (1987–199), the Sanger Centre (199–2000) and LION Bioscience (2000–2002).

Team leader at EMBL-EBI since 2003.

Grid and e-Science research and development

Previous and current research

The team’s focus is on the integration of bioinformatics tools and data resources. We have the remit to investigate and advise on the e-Science and Grid technology requirements of EMBL-EBI, through application development, training exercises and participation in international projects and standards development. Our group is responsible for the EMBOSS open source sequence analysis package, the Taverna bioinformatics workflow system (originally developed as part of the myGrid UK e-Science project) and for various projects (including EMBRACE and ComparaGrid) that integrate access to bioinformatics tools and data content.

To date, Grid development has focussed on the basic issues of storage, computation and resource management needed to make a global scientific community’s information and tools accessible in a high-performance environment. However, from the e-Science point of view, the purpose of the Grid is to deliver a collaborative and supportive environment that enables geographically distributed scientists to achieve research goals more effectively, while allowing their results to be used in developments elsewhere.

Our group has been the biological specialist participant in the UK-funded myGrid project, and this collaboration is continuing with the Open Middleware Infrastructure Institute (OMII-UK). The project aimed to develop and maintain open source, high-level, service-based middleware to support the construction, management and sharing of data-intensive in silico experiments in biology. EMBL-EBI contributes through the Taverna workbench and as a developer and provider of application and data services, a role that continues through the EMBRACE and EMBOSS projects.

A key factor in the success of EMBOSS, and in particular its selection as the application platform for the EMBRACE and myGrid projects, has been its development and implementation of the AJAX Command Definition (ACD) standard. ACD files define the interface of each EMBOSS application and are used directly by the application on startup for all command-line processing and interaction with the user.
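As an illustration only, the idea of a declarative interface definition that drives command-line processing can be sketched in Python. This is not ACD syntax and not EMBOSS code: the `DEFINITION` structure, the application name `demo_seqtool`, its parameters and the helper functions are all hypothetical, and the standard `argparse` module stands in for the real command-line handling.

```python
import argparse

# Hypothetical, simplified interface definition. This is NOT real ACD syntax;
# it only mimics the idea of declaring an application's parameters as data.
DEFINITION = {
    "application": "demo_seqtool",  # hypothetical application name
    "documentation": "Toy sequence tool driven by a declarative definition",
    "parameters": [
        {"name": "sequence", "type": str, "required": True,
         "help": "input sequence identifier"},
        {"name": "window", "type": int, "default": 10,
         "help": "window size"},
    ],
}


def build_parser(definition):
    """Build a command-line parser purely from the declarative definition."""
    parser = argparse.ArgumentParser(
        prog=definition["application"],
        description=definition["documentation"],
    )
    for param in definition["parameters"]:
        parser.add_argument(
            "--" + param["name"],
            type=param["type"],
            required=param.get("required", False),
            default=param.get("default"),
            help=param["help"],
        )
    return parser


def parse_command_line(definition, argv):
    """Parse argv against the definition and return a plain dict of values."""
    return vars(build_parser(definition).parse_args(argv))
```

Because the interface lives in data rather than in the program logic, the same definition could in principle also be read by a workflow client or a service wrapper to discover an application's inputs without running it.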

The EMBRACE project, an EU-funded Network of Excellence, is now in its second year, with the aim of defining and implementing a consistent standard interface to integrate data content and analysis tools across all EMBL-EBI core databases and those provided by our partners. The early focus of this five-year project has been on the sequence and structure data resources at EBI and the EMBOSS applications. Our group is also active in defining the core technologies to be used by EMBRACE, including BioMart data federation methods, web services provided by the EBI External Services team, and the Taverna workbench as an end-user client.

Future projects and goals

The services provided by the group remain largely SOAP-based web services. These have proved highly useful for prototyping and developing service and metadata standards. We are looking, especially through the EMBRACE project, to migrate to true Grid services, but like many other groups we are waiting for the long-anticipated merging of web and grid service standards.

The EMBOSS project plans to expand in the coming few years to cover bioinformatics more generally, including genomics, protein structure, gene expression, proteomics, phylogenetics, genetics and biostatistics. This will require the participation of external groups to expand the project beyond its current EBI base, and we are actively seeking potential partners in each area. We expect to build a service-based e-Science architecture around the applications and data resources through the EMBRACE project, with support and guidance from the community of users in academia and industry.

The EMBRACE project will move beyond sequence data and analysis services to cover the remaining areas of the EBI’s core databases and to integrate services from our partners using the same standards and interfaces.

Selected references

Belhajjame, K. et al. (2008). Metadata management in the Taverna workflow system. In ‘Proceedings CCGRID 2008 – 8th IEEE International Symposium on Cluster Computing and the Grid’, 651-656

Lanzen, A. & Oinn, T. (2008). The Taverna Interaction Service: Enabling manual interaction in workflows. Bioinformatics, 2, 1118-1120

Li, P. et al. (2008). Automated manipulation of systems biology models using libSBML within Taverna workflows. Bioinformatics, 2, 287-289

Li, P. et al. (2008). Performing statistical analyses on quantitative data in Taverna workflows: An example using R and maxdBrowse to identify differentially-expressed genes from microarray data. BMC Bioinformatics, 9, Article 33
