
BeNeLux Bioinformatics Conference – Antwerp, December 7-8 2015

Abstract ID: P

Poster

10th Benelux Bioinformatics Conference bbc 2015

P66. PLADIPUS EMPOWERS UNIVERSAL DISTRIBUTED COMPUTING

Kenneth Verheggen 1,2,3*, Harald Barsnes 4,5, Lennart Martens 1,2,3 & Marc Vaudel 4.

Medical Biotechnology Center, VIB, Ghent, Belgium 1; Department of Biochemistry, Ghent University, Ghent, Belgium 2; Bioinformatics Institute Ghent, Ghent University, Ghent, Belgium 3; Proteomics Unit, Department of Biomedicine, University of Bergen, Norway 4; KG Jebsen Center for Diabetes Research, Department of Clinical Science, University of Bergen, Norway 5. *kenneth.verheggen@vib-ugent.be

The use of proteomics bioinformatics substantially contributes to an improved understanding of proteomes, but this novel and in-depth knowledge comes at the cost of increased computational complexity. Parallelization across multiple computers, a strategy termed distributed computing, can be used to handle this increased complexity. However, setting up and maintaining a distributed computing infrastructure requires resources and skills that are not readily available to most research groups.

Here, we propose a free and open source framework named Pladipus that greatly facilitates the establishment of distributed computing networks for proteomics bioinformatics tools.

INTRODUCTION

Various modern-day bioinformatics-related fields have a growing focus on large-scale data processing. This inevitably leads to increased complexity, as illustrated by recent efforts to produce a comprehensive MS-based characterization of the human proteome (Kim et al., 2014; Wilhelm et al., 2014). Such high-throughput, complex studies are becoming increasingly popular, but they require high-performance computational setups to be analyzed swiftly.

METHODS

Here, we present a generic platform for distributed proteomics software, called Pladipus. It provides an end-user-oriented solution to distribute bioinformatics tasks over a network of computers, managed through an intuitive graphical user interface (GUI).

Pladipus comes with several modules that work out of the box. They include SearchGUI (Vaudel et al., 2011), PeptideShaker (Vaudel et al., 2015), DeNovoGUI (Muth et al., 2014), MsConvert (part of ProteoWizard (Kessner et al., 2008)) and three common forms of the BLAST (Altschul et al., 1990) algorithm (blastn, blastp and blastx). These modules can be linked together, along with custom in-house algorithms, to set up pipelines tailored to specific needs. The whole pipeline can then be executed on an inexpensive, scalable cluster infrastructure without additional cost or expert maintenance. Pladipus can even be set up to allow existing (idle) hardware to hook into the network and participate in the processing.
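The idea of linking modules into a pipeline and letting several workers process it in parallel can be illustrated with a minimal sketch. It assumes a pull-based job queue, which the abstract does not specify, and it runs the workers as threads on one machine instead of as separate computers. None of the names below (`run_pipeline`, `distribute`, `convert`, `search`, `postprocess`) come from Pladipus; they are hypothetical placeholders for real tools such as MsConvert, SearchGUI and PeptideShaker.

```python
import queue
import threading


def run_pipeline(steps, sample):
    """Apply each step in order, feeding every output into the next step."""
    result = sample
    for step in steps:
        result = step(result)
    return result


def worker(jobs, results, steps):
    """Pull samples from the shared queue until it is empty."""
    while True:
        try:
            sample = jobs.get_nowait()
        except queue.Empty:
            return
        results.put((sample, run_pipeline(steps, sample)))


def distribute(samples, steps, n_workers):
    """Process all samples with n_workers threads and collect the results."""
    jobs, results = queue.Queue(), queue.Queue()
    for sample in samples:
        jobs.put(sample)
    threads = [
        threading.Thread(target=worker, args=(jobs, results, steps))
        for _ in range(n_workers)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    collected = {}
    while not results.empty():
        sample, output = results.get()
        collected[sample] = output
    return collected


# Hypothetical stand-ins for real pipeline steps (assumptions, not Pladipus code).
def convert(name):
    return name + ".mgf"


def search(name):
    return name + ".searched"


def postprocess(name):
    return name + ".report"


STEPS = [convert, search, postprocess]
processed = distribute(["run01", "run02", "run03"], STEPS, n_workers=2)
```

Because each worker asks for the next job only when it has finished the previous one, an extra machine that joins late simply starts pulling from the same queue, which is one way idle hardware could take part in the processing.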

RESULTS & DISCUSSION

To numerically assess the benefits of using a distributed computing framework, 52 CPTAC experiments (LTQ-Study6: Orbitrap@86) (Paulovich et al., 2010) were searched three times against a protein sequence database (UniProtKB/SwissProt, release 2015_05) on Pladipus networks of various sizes. Three search engines were applied: X!Tandem, Tide and MS-GF+. As expected for a distributed system, the wall time was highly reproducible and decreased nearly exponentially with the number of workers.
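For intuition about how wall time can depend on the number of workers, the sketch below uses a deliberately simplified model: independent tasks of equal duration, spread evenly over the workers, with no scheduling or data transfer overhead. The model, the function name `ideal_wall_time`, the choice of one task per experiment and the task duration are all assumptions for illustration. They are not taken from the benchmark in Figure 1 and do not reproduce its measurements.

```python
import math


def ideal_wall_time(n_tasks, n_workers, task_seconds):
    """Wall time for equal-length, independent tasks spread evenly over workers."""
    if n_workers < 1:
        raise ValueError("at least one worker is required")
    # The slowest worker handles ceil(n_tasks / n_workers) tasks back to back.
    return math.ceil(n_tasks / n_workers) * task_seconds


N_TASKS = 52        # one task per experiment (assumption)
TASK_SECONDS = 600  # hypothetical duration of a single search

for n_workers in (1, 2, 4, 8, 16):
    seconds = ideal_wall_time(N_TASKS, n_workers, TASK_SECONDS)
    print(f"{n_workers:>2} workers: {seconds / 3600:.2f} h")
```

In this idealized model the wall time falls roughly in inverse proportion to the number of workers, with small steps wherever the tasks do not divide evenly. Measured curves can deviate from it because real tasks differ in length and because queueing and file transfer add overhead.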

FIGURE 1. Benchmarking of a Pladipus network (16 GB RAM, 12 cores, 250 GB disk space, Ubuntu Precise)

Pladipus is freely available as open source under the permissive Apache 2.0 license. Documentation, including example files, an installer and a video tutorial, can be found at https://compomics.github.io/projects/pladipus.html.

REFERENCES

Altschul,S.F. et al. (1990) Basic local alignment search tool. J. Mol. Biol., 215, 403–10.

Kessner,D. et al. (2008) ProteoWizard: open source software for rapid proteomics tools development. Bioinformatics, 24, 2534–6.

Kim,M.-S. et al. (2014) A draft map of the human proteome. Nature, 509, 575–81.

Muth,T. et al. (2014) DeNovoGUI: an open source graphical user interface for de novo sequencing of tandem mass spectra. J. Proteome Res., 13, 1143–6.

Paulovich,A.G. et al. (2010) Interlaboratory study characterizing a yeast performance standard for benchmarking LC-MS platform performance. Mol. Cell. Proteomics, 9, 242–54.

Vaudel,M. et al. (2015) PeptideShaker enables reanalysis of MS-derived proteomics data sets. Nat. Biotechnol., 33, 22–24.

Vaudel,M. et al. (2011) SearchGUI: An open-source graphical user interface for simultaneous OMSSA and X!Tandem searches. Proteomics, 11, 996–9.

Wilhelm,M. et al. (2014) Mass-spectrometry-based draft of the human proteome. Nature, 509, 582–7.
