08.02.2013 Views

New Statistical Algorithms for the Analysis of Mass - FU Berlin, FB MI ...


CHAPTER 6. PROTEOMICS.NET - PRODUCT-ORIENTED CASE STUDIES

the platform, get the worker status information and display it can be written in fewer than 20 lines, as Listing 6.1 shows:

Listing 6.1: Minimal code for using the “get current worker()” web service in Java.

 1  import org.apache.axis.client.Call;
 2  import org.apache.axis.client.Service;
 3  import javax.xml.namespace.QName;
 4  import de.fuberlin.mi.proteomics.proteomicsnetwebservices.*;
 5
 6  public class clientjava {
 7    public static void main(String[] args) {
 8      try
 9      {
10        ServicesinfoLocator loc = new ServicesinfoLocator();
11        ServicesinfoSoap port = loc.getservicesinfoSoap();
12        System.out.println(port.getcurrentworker());
13      }
14      catch (Exception e)
15      { System.out.println(e.getMessage()); }
16    }
17  }

The actual (synchronous) call to the web service happens in line 12, in the call port.getcurrentworker(). This line could also contain far more complex calls, for example with objects as input and output parameters. Thanks to the transformation services (e.g. within the WSDL2Java tool), parameters sent to and received from the web service are mapped to Java data types.

6.2.3 Integrating External Web Services on the Example of Protein Identification

Background<br />

One key issue in proteomics is to identify proteins and characterize their expression in cells. In mass-spectrometry-based proteomics this is done either by peptide mass fingerprinting (PMF) of MS¹ spectra or by further fragmenting single peptides, producing MS² spectra from which (ideally) the amino acid sequence can be derived.

The PMF approach (also known as protein fingerprinting) is an analytical technique for protein identification developed in the early 1990s. The basic idea is to digest an unknown protein of interest with a sequence-specific protease (such as trypsin). The set of resulting peptides (fragments) forms a unique identifier (fingerprint) of the unknown protein for this protease and is subsequently compared to databases containing known fragmentation patterns for this protease. Obviously, the mass accuracy to which the peptides are measured plays a crucial role (Green et al., 1999).
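The fingerprint comparison described above can be sketched in Java as follows. This is a minimal illustration, not the method used by any particular search engine: the class and method names (PmfSketch, digest, peptideMass, countMatches), the example sequence and the ppm tolerance are assumptions made for this sketch. It uses the common trypsin rule (cleave after K or R unless the next residue is P), standard monoisotopic residue masses, no missed cleavages, and assumes the measured values have already been converted to neutral monoisotopic masses.

```java
import java.util.ArrayList;
import java.util.List;

final class PmfSketch {
    // Standard monoisotopic residue masses in Da, indexed in parallel with RESIDUES.
    static final String RESIDUES = "GASPVTCLINDQKEMHFRYW";
    static final double[] MASSES = {
        57.02146, 71.03711, 87.03203, 97.05276, 99.06841,
        101.04768, 103.00919, 113.08406, 113.08406, 114.04293,
        115.02694, 128.05858, 128.09496, 129.04259, 131.04049,
        137.05891, 147.06841, 156.10111, 163.06333, 186.07931
    };
    static final double WATER = 18.01056;

    static double residueMass(char c) {
        int i = RESIDUES.indexOf(c);
        if (i < 0) throw new IllegalArgumentException("Unknown residue: " + c);
        return MASSES[i];
    }

    // In silico tryptic digest: cleave after K or R unless followed by P.
    static List<String> digest(String protein) {
        List<String> peptides = new ArrayList<>();
        int start = 0;
        for (int i = 0; i < protein.length(); i++) {
            char c = protein.charAt(i);
            boolean last = (i + 1 == protein.length());
            boolean site = (c == 'K' || c == 'R')
                    && (last || protein.charAt(i + 1) != 'P');
            if (site) {
                peptides.add(protein.substring(start, i + 1));
                start = i + 1;
            }
        }
        if (start < protein.length()) peptides.add(protein.substring(start));
        return peptides;
    }

    // Neutral monoisotopic peptide mass: sum of residues plus one water.
    static double peptideMass(String peptide) {
        double m = WATER;
        for (char c : peptide.toCharArray()) m += residueMass(c);
        return m;
    }

    // Number of measured masses explained by a theoretical peptide within tolPpm.
    static int countMatches(double[] measured, String protein, double tolPpm) {
        List<String> peptides = digest(protein);
        int hits = 0;
        for (double obs : measured) {
            for (String p : peptides) {
                double theo = peptideMass(p);
                if (Math.abs(obs - theo) <= theo * tolPpm * 1e-6) {
                    hits++;
                    break;
                }
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        String candidate = "GASKPVRTLK";          // hypothetical database entry
        double[] measured = {360.2373, 500.0};    // hypothetical neutral masses
        System.out.println(digest(candidate));
        System.out.println(countMatches(measured, candidate, 50.0));
    }
}
```

A real search would score every database protein in this way and rank the candidates; the tolerance parameter is where the mass accuracy mentioned above enters.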

In MS² spectra analysis, peptides of interest, identified during an MS¹ run, are fragmented further in a collision cell to produce tandem (MS/MS, MS²) mass spectra. Since fragmentation (usually) happens at the backbone peptide bonds, putting together matching pieces (that add up to the full peptide) and analyzing the point of rupture makes it possible (in principle) to determine the amino acids. This approach is called de novo sequencing (see, e.g., Ma et al., 2003; Halligan et al., 2005, and references therein). The second large class of algorithms for the identification problem is based on comparing the experimental spectrum against a database of theoretical spectra determined by in silico digestion and fragmentation of known proteins.
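To make the idea of a theoretical spectrum concrete, the following Java sketch computes singly charged b- and y-ion m/z values for a peptide, assuming cleavage at each backbone peptide bond, and counts how many observed peaks they explain. The names (FragmentIonSketch, bIons, yIons, sharedPeaks), the example peptide and the absolute tolerance are assumptions for illustration; a plain shared-peak count is a simplified stand-in for the scoring functions used by actual database search tools.

```java
final class FragmentIonSketch {
    // Standard monoisotopic residue masses in Da, indexed in parallel with RESIDUES.
    static final String RESIDUES = "GASPVTCLINDQKEMHFRYW";
    static final double[] MASSES = {
        57.02146, 71.03711, 87.03203, 97.05276, 99.06841,
        101.04768, 103.00919, 113.08406, 113.08406, 114.04293,
        115.02694, 128.05858, 128.09496, 129.04259, 131.04049,
        137.05891, 147.06841, 156.10111, 163.06333, 186.07931
    };
    static final double WATER = 18.01056;
    static final double PROTON = 1.00728;

    static double residueMass(char c) {
        int i = RESIDUES.indexOf(c);
        if (i < 0) throw new IllegalArgumentException("Unknown residue: " + c);
        return MASSES[i];
    }

    // b1..b(n-1): N-terminal prefixes plus one proton.
    static double[] bIons(String peptide) {
        double[] b = new double[peptide.length() - 1];
        double sum = PROTON;
        for (int i = 0; i < b.length; i++) {
            sum += residueMass(peptide.charAt(i));
            b[i] = sum;
        }
        return b;
    }

    // y1..y(n-1): C-terminal suffixes plus water and one proton.
    static double[] yIons(String peptide) {
        double[] y = new double[peptide.length() - 1];
        double sum = WATER + PROTON;
        for (int i = 0; i < y.length; i++) {
            sum += residueMass(peptide.charAt(peptide.length() - 1 - i));
            y[i] = sum;
        }
        return y;
    }

    static boolean near(double x, double[] values, double tolDa) {
        for (double v : values) {
            if (Math.abs(x - v) <= tolDa) return true;
        }
        return false;
    }

    // Number of observed peaks matched by a theoretical b or y ion within tolDa.
    static int sharedPeaks(double[] observed, String peptide, double tolDa) {
        double[] b = bIons(peptide);
        double[] y = yIons(peptide);
        int hits = 0;
        for (double mz : observed) {
            if (near(mz, b, tolDa) || near(mz, y, tolDa)) hits++;
        }
        return hits;
    }

    public static void main(String[] args) {
        String peptide = "TLK";                       // hypothetical candidate peptide
        double[] observed = {102.055, 147.113, 300.0}; // hypothetical peak list (m/z)
        System.out.println(java.util.Arrays.toString(bIons(peptide)));
        System.out.println(java.util.Arrays.toString(yIons(peptide)));
        System.out.println(sharedPeaks(observed, peptide, 0.01));
    }
}
```

The same mass ladder underlies de novo sequencing: the difference between consecutive ions of one series equals a residue mass, which is how the amino acids can (in principle) be read off the spectrum.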
